TO THE INSTRUCTOR


This program introduces the student to the elements of statistical reasoning and the 
manner in which this form of reasoning enters into the process of behavioral research. 
The material is presented in a carefully pre-tested sequence. Fundamental topics 
such as variables, values, and distributions are broken down into a series of small, 
sequentially organized steps. The student is led into discussion of such relatively 
complex concepts as decision rules and the probability of Type One and Type Two errors 
only after the prerequisites for his comprehension have been developed in detail. 


After dealing with data as a collection of observed values of a variable, the program
considers distributions of values, along with the manner in which descriptive statistics
characterize various features of these distributions. Formulas for the mean and variance
are developed in detail, and in a way that clearly indicates the exact feature of a
distribution represented by each expression.


The concept of a sampling distribution is the central theme of the sections on 
statistical inference. Thus, the effects of sampling procedure and sample size in 
determining distributions are discussed in detail. In addition, the role of probability 
theory in the calculation of theoretical sampling distributions is considered as a basis
for using random sampling procedures.


In programmed learning, each student proceeds at his own rate. In addition, the 
material is constructed so that the student actively participates in the learning process, 
supplying answers which require his understanding of that item. The answers constitute 
immediate feedback and reinforce the student's correct responses. 


Periodic tests covering the material are provided to allow the student to evaluate
his understanding as he progresses. Two forms of these exams are included in the
instructor's manual.


TO THE STUDENT 


Your programmed text is quite different from an ordinary book. A program
consists of a large number of "frames," or numbered statements, each of which
tells you something and asks questions about the material you have learned. The
frames introduce new material a little at a time and review old material as needed
to make sure you will remember it.


You do not study a program in the same way you study a book. The program is
designed to let you work at your own rate of speed.


Get ready to use your program by covering the answer column at the right with
the slider provided for you. Next, read the first frame and write the answer either
in the blank or on a separate piece of paper. Then, move your slider down to uncover
the answer and see if you are right. Go on in the same way to the following frames,
checking your answer to each frame before going on to the next. Practice with the
following frames.


1. When a blank has nothing under it, you simply fill in
   whatever best fits the blank. For example, the day of the
   week that follows Thursday is ______.                        Friday

2. When a blank has two answers underneath it, you select
   the one that best fits. For example, a dog ______ an         is
                                              is / is not
   animal.


You will find that the left-hand pages of the program are upside-down and backwards. 
Pay no attention to them until you have finished all of the right-hand pages. 


Turn to Frame 1 and begin.


Section I: Data


1. Suppose you were a psychologist studying
   how accurately a person judges distance. You might
   place a target some distance in front of a subject and
   ask him how far away it was. You would be interested
   in the difference between the distance he reported and
   the true ______ of the target.                               distance

2. You might decide to move the target after each of the
   subject's judgments, asking him to judge each new
   distance. In your experiment, you would be collecting
   many different judgments of distance by changing or
   varying the ______ between the subject and                   distance
   the target.

3. We refer to something that does not change during an
   experiment as a constant. Since the distance between
   the subject and the target is varied or changed, it
   ______ be a constant in the experiment we just               would not
   would / would not
   described.

4. We refer to the opposite of a constant as a variable.
   Something that changes in an experiment is called a
   variable. In the experiment we just described, the
   distance between the subject and the target is varied;
   therefore, we would refer to the varying distance as a
   ______.                                                      variable
   constant / variable

5. The target distance would be called a variable in
   your experiment because this distance was ______             changed
   during the experiment.

6. As a psychologist, you might wish to study how
   accurately a person judges weights. You might give
   the subject a lead ball and ask him to fill up a bag with
   sand until the bag felt as heavy as the lead ball. When
   he was finished trying to match the weight of the sand
   bag and the lead ball, you could compare the ______          weight
   of the bag of sand and the weight of the lead ball.

7. You could repeat this procedure several times, each
   time changing or varying the size of the lead ball. In
   this experiment, the weight of the lead ball would be
   referred to as a ______.                                     variable
                    constant / variable

8. The weight of the lead ball was a variable in this
   experiment since this weight was ______                      changed
   during the experiment.

9. Suppose you conducted the experiment differently.
   Suppose you gave the subject the same lead ball each
   time. You would not expect the subject to produce a
   bag of sand weighing exactly the same amount each
   time, even though the ______ of the lead ball                weight
   remained the same throughout the experiment.

   If you weighed each bag of sand carefully, you would
   doubtlessly find that the weight differed slightly for
   each bag. Therefore, in this experiment, the weight
   of the lead ball was a ______, whereas the                   constant
                          constant / variable
   weight of the bag of sand the subject produced each
   time was a ______.                                           variable
              constant / variable





10. As a psychologist, you might be interested in how well
    people can remember things. For example, you might
    read a list of six letters of the alphabet to a subject,
    such as m, t, s, p, g, k, and then ask him to repeat this
    list. Suppose the subject said that the letters were m,
    t, s, p, g, d. He would have repeated only the first
    ______ letters correctly.                                   5
    4 / 5

11. Suppose you then read another list of 6 letters, for
    example p, f, t, m, s, r. If the subject's response was
    p, f, t, g, f, n, he would have repeated only the first
    ______ letters correctly.                                   3

12. You could conduct an experiment in which you gave the
    subject many different lists of six letters, each time
    asking him to repeat as many of the letters as he could.
    While the actual list of letters would be varied in this
    experiment, the number of letters in each list would
    always be six.

    Since the number of letters in each list would always be
    six, the number of letters would be a ______                constant
                                          constant / variable
    in this experiment.

13. Every time you give the subject a list of letters, you
    could record how many letters he repeated correctly
    before making an error. You would call this his score
    on each list. Therefore, if he repeated four of the six
    letters correctly before making an error, his score for
    that list would be ______.                                  4

14. If you read the subject the letters m, p, t, x, z, s, and
    his response was m, p, t, x, z, d, his score on that
    list would be ______.                                       5

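The scoring rule used in these frames can also be written out as a short procedure. What follows is only a sketch in Python, a language this program does not otherwise use; the function name `score` and the choice to write each list as a string of letters are assumptions made for illustration.

```python
def score(presented, repeated):
    """Count the letters repeated correctly before the first error."""
    count = 0
    for shown, said in zip(presented, repeated):
        if shown != said:
            break          # the first error ends the count
        count += 1
    return count


# The two lists read to the subject in the frames above.
first_list = score("mtspgk", "mtspgd")
second_list = score("pftmsr", "pftgfn")
print(first_list, second_list)
```

Because every list has six letters, the value returned can never be smaller than 0 or larger than 6, whatever the subject says.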
15. Since every list contains 6 letters, whenever the subject
    repeated the list without making an error, his score
    would be ______. The worst possible score he could          6
    make would be zero, which would mean he had repeated
    ______ of the letters correctly.                            none
    none / all

16. You know, therefore, that all of the subject's scores
    will be no greater than ______ and no less than ______.     6, 0

17. Because the subject's score can vary or change from
    list to list, it would be called a ______ in this           variable
    experiment.

18. Considering his score as a variable, list below the
    possible scores that the subject could make (beginning
    with the worst score and moving in order to the best).
    ______                                                      0, 1, 2, 3, 4, 5, 6

    We will refer to each of these possible scores from
    0 to 6 as a possible value of this variable. Thus, the
    smallest possible value of this variable is ______, and     0
    the largest possible value is ______.                       6

19. The subject ______ receive a score of 10                    could not
                could / could not
    because the list contains only six letters. Therefore,
    10 ______ a possible value of this variable.                is not
       is / is not

20. We have used the word ______ to refer to                    variable
    things that may change or vary during an experiment.
    We use the word ______, however, to refer to                constant
    things that do not change during an experiment.

21. One of the variables we considered was distance. A
    value of that variable would be any particular distance.
    Since 10 feet is a particular distance, it would be a
    ______ of that variable.                                    value

22. Another variable we considered was weight. Values of
    that variable could be particular ______, such              weights
    as 10 pounds, 2 ounces, 3 pounds, etc.

23. There are many different things which could be
    variables in an experiment. All variables, however,
    have one thing in common. They are things that may
    ______ during an experiment.                                change

24. Since many things may be variables, it is often useful
    to give each variable a name. For example, "distance"
    is the name we used for one variable we considered,
    whereas "weight" is the n______ of another                  name
    variable we considered.

25. Any particular distance is a value of the variable named
    "distance." Any p______ weight is a value of                particular
    the variable named "weight."

26. Two feet ______ be a value of the variable                  would not
             would / would not
    named "weight."

27. In the experiment in which the subject tried to repeat a
    list of six letters, we considered a variable named
    "score." The subject's "score" was the number of
    letters he repeated correctly before making an error.
    The different possible ______ of this variable were         values
    0, 1, 2, 3, 4, 5, 6.

28. Therefore, if the subject's score changed from 3 to 5,
    we would say that the ______ of the variable                value
                          value / name
    had changed.

29. You have learned that a ______ is something                 variable
    that can change during an experiment. Every time a
    variable changes, it changes from one ______ to             value
    another.

30. Psychologists have studied how fast a rat will run
    down a narrow alley to secure food. The picture below
    shows an experimental setup you might have used if you
    were studying running speeds of rats.

    [Figure: RAT RUNWAY]

30. (continued)

    The rat is placed in one end of the runway and food is
    placed in the ______ end of the runway.                     other
                  same / other

    When the door is opened in front of the rat, he is free
    to move all the way to the far end of the runway to
    secure the ______.                                          food

    You would be interested in the time between the opening
    of the ______ and the time when the animal                  door
    reached the food.

31. Let's imagine that you are conducting the following
    experiment. Suppose you took a rat that normally ate
    four ounces of food a day and that had never been in the
    runway before. You begin to feed the rat only in the
    runway. At precisely the same time each day, you
    place the rat at the starting point of the runway and four
    ounces of food at the other end. Then, you open the
    door of the runway. Each day, you would record the
    ______ between the opening of the door and                  time
    the rat's reaching the food.

32. In this experiment, you would be interested in a
    ______ named running time.                                  variable
    variable / constant

33. The amount of food placed in the runway each day
    would be a ______ in this experiment.                       constant
               variable / constant

34. Suppose you repeated this experiment for 10 days.
    The record of your results might look like this:

        DAY     RUNNING TIME
          1       200 sec.
          2       100 sec.
          3       150 sec.
          4        80 sec.
          5        40 sec.
          6        41 sec.
          7        15 sec.
          8        10 sec.
          9         4 sec.
         10         3 sec.

    On day 1, the rat took 200 sec. from the time the
    door opened to reach the food. On day 2, he took
    100 sec. On day 3, he took ______ sec.                      150

35. The rat's running time on day 10 was ______.                3 sec.

36. The rat's running time was ______ on day 7                  shorter
                               longer / shorter
    than on day 3.

36. Running time was ______ on day 3 than                       longer
                     longer / shorter
    on day 2.
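
The comparisons in the last few frames can be checked directly against the record of running times. The sketch below is in Python and assumes the ten recorded times are listed in day order, day 1 through day 10; the names `running_time` and `compare` are illustrative only and are not part of the program.

```python
# Observed values of the running-time variable, in seconds, keyed by day.
# Assumption: the record lists the times in order from day 1 to day 10.
running_time = {1: 200, 2: 100, 3: 150, 4: 80, 5: 40,
                6: 41, 7: 15, 8: 10, 9: 4, 10: 3}


def compare(day_a, day_b):
    """Say whether the running time on day_a was shorter or longer than on day_b."""
    a, b = running_time[day_a], running_time[day_b]
    if a < b:
        return "shorter"
    if a > b:
        return "longer"
    return "equal"


print(compare(7, 3))   # day 7 compared with day 3
print(compare(3, 2))   # day 3 compared with day 2
```

Each entry in `running_time` is one particular value of the variable; the dictionary as a whole is the record of what was observed.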
37. Instead of referring to "the variable named running
    time," we shall simply say "running-time" variable.
    Thus, the variable we call "running time" will be
    referred to as the "______"                                 running-time
    variable.

38. Particular running times are ______ of the                  values
    "running-time" variable.

39. A large part of any scientist's work consists of
    observing the variables in experiments and making
    records of these observations.

    Psychologists are chiefly interested in behavior,
    either the behavior of human beings or the behavior of
    animals. Therefore, if you were a psychologist, a
    large part of your work would probably consist of
    observing b______ and making records of                     behavior
    these observations.

40. You might be interested in observing how quickly a
    person could solve a mathematical problem, how
    accurately he could estimate the weight of some
    object, or how much time he spent sleeping. Whatever
    behavior you were interested in, you would
    probably make ob______ of this behavior                     observations
    and records of your ob______.                               observations

41. Each of the variables we have discussed has had
    different possible values. For example, if you tossed
    a coin in the air and it fell on one side or the other,
    the two possible results you could observe would be a
    "head" facing up or a "______" facing up. These             tail
    two possible outcomes are the possible values of the
    variable we would call "results of a coin flip," or
    perhaps "falls of a coin."

42. What if the coin were tossed three times and you
    observed that it fell with the "head" facing upwards all
    three times? The observed values of the variable "falls
    of a coin" would be "heads," "heads," and "heads."
    Even though you did not actually observe a "tail," the
    two ______ values of the variable are still                 possible
        possible / observed
    "heads" or "tails."



43. Suppose you were conducting a telephone survey in
    which you asked each person you called whether or
    not he had been watching television. Assuming each
    person answered the question, the two possible answers
    are "yes" or "no." Thus, the two possible values of
    the "answer" variable are "______" and "______."            yes, no

44. Suppose four people were called in this telephone survey.
    Suppose you observed that the first person said "yes,"
    the second person said "no," the third person said "yes,"
    and the fourth said "yes." The record of your
    observations would be:

                      Response
        1st person    yes
        2nd person    no
        3rd person    yes
        4th person    yes

    These four answers are the ______ values                    observed
                               observed / possible
    of the variable under study.

45. Imagine that you were learning how to bowl and that you
    decided to keep track of how you improved with each
    lesson. As a test of your skill you could bowl 5 balls
    following each lesson. After each ball, you could reset
    any pins you had knocked down so that exactly 10 pins
    were standing each of the five times you rolled a ball.
    The most pins you could knock down with any one ball
    would be ______ pins. The worst you could do with           10
    any one ball would be to knock down ______ pins.            0

46. If we consider the number of pins you knock down with
    each ball to be a variable, the smallest possible value
    of the variable would be ______ and the largest possible    0
    ______ of the variable would be 10.                         value

47. List all the possible values of the "pins-knocked-down"
    variable, starting with the smallest possible value and
    ending with the largest possible value.
    ______                                                      0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10

48. Suppose after the first lesson you rolled 5 balls with the
    following results:

                    PINS KNOCKED DOWN
        1st Ball            0
        2nd Ball            0
        3rd Ball            2
        4th Ball           10
        5th Ball            5

    According to this list of results, you knocked none of
    the pins down with the first ball, none down with the
    second ball, ______ down with the 3rd ball,
    ______ down with the 4th, and 5 down with the 5th ball.     2, 10

    Thus, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10 are the
    ______ values of the variable we are                        possible
    possible / observed
    studying (the "pins-knocked-down" variable), and 0, 0,
    2, 10, and 5 are the ______ values of that                  observed
                         possible / observed
    variable.

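The difference between possible values and observed values can be made concrete with the bowling record. This is a minimal sketch in Python, not part of the program itself; the names `possible_values`, `observed_values`, and `never_observed` are assumptions chosen for illustration.

```python
# Possible values of the "pins-knocked-down" variable: 0 through 10.
possible_values = set(range(0, 11))

# Observed values from the five balls rolled after the first lesson.
observed_values = [0, 0, 2, 10, 5]

# Every observed value must be one of the possible values.
assert all(value in possible_values for value in observed_values)

# Possible values that did not happen to be observed this time.
never_observed = sorted(possible_values - set(observed_values))
print(never_observed)

# A value such as 11 is not possible, so it could never be observed.
print(11 in possible_values)
```

The observed values are kept in a list rather than a set because the same value (here 0) can be observed more than once, and each occurrence is a separate observation.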
49. Each observed value of a variable is referred to
    as a single observation. Therefore, in the illustration
    we just considered, the five observed values of the
    "pins-knocked-down" variable would be referred to as
    five ______.                                                observations

50. If you were studying how fast a rat ran down an alley to
    secure food and observed that it took him exactly
    10 seconds, this time would be an observed value of
    the "running-time" variable and ______ be                   would
                                    would / would not
    referred to as a single observation.

51. It is possible the rat could take as long as 20 minutes to
    reach the food. Suppose, however, that the longest
    observation of running time was 3 minutes. Therefore,
    whereas ______ minutes would be a possible value of the     20
            3 / 20
    running-time variable, 3 minutes would be both a
    possible value and an observed value.

52. When a particular variable is observed or studied in an
    experiment, records are made of the observed values
    of this variable. These records are called data. For
    example, we just considered an experiment in which a
    rat ran down an alley to get food. The time it took for
    the rat to reach the food each day was observed and
    recorded. Our records of these running times are the
    d______ from that experiment.                               data

53. Another experiment we discussed dealt with a subject's
    repeatedly attempting to match the weight of a lead ball
    by filling a bag with sand. Records of the weight of
    each bag of sand the subject produced are the
    ______ from that experiment.                                data

54. Notice that we referred to the ______ values                observed
                                   possible / observed
    of the variable as the data. A list of the possible values
    of a variable would not be considered data.

55. Suppose you tossed a coin two times and the observed
    value of the variable "falls of the coin" was "heads" on
    both tosses. Then "heads" and "tails" are the two
    ______ values of the variable, whereas                      possible
    observed / possible
    "heads" and "heads" are the two ______                      observed
                                    possible / observed
    values.

56. In the preceding frame, the list of values we would call
    data was ______.                                            "heads" and "heads"
    "heads" and "heads" / "heads" and "tails"

    "Heads" and "heads" would be considered data since
    they are the ______ values of the variable.                 observed
                 possible / observed

57. Television stations are naturally interested in which
    programs are preferred by television viewers. Imagine
    that you were hired to determine viewing preferences
    in an area in which there were only three television
    channels: Channel 5, Channel 7, and Channel 9.
    Suppose you asked a number of television viewers the
    following question: "If you had to watch only one of
    the three television stations for a week, which one
    would you select?" Acceptable answers to this
    question would be Channel ______, ______, or ______.        5, 7, 9

58. If you considered the viewer's answer as a variable,
    the three possible values of the variable would be 5,
    7, and ______.                                              9

59. While you probably would ask many people this question
    in a real study, let us suppose you asked only three
    people and the first person selected 7, the second
    person selected 9, and the third person selected 9. The
    values 5, 7, and 9 would be the ______                      possible
                                    possible / observed
    values of the "answer" variable, whereas the values 7,
    9, and 9 would be the ______ values of the                  observed
    variable.

60. Since an observation is any observed value of a variable,
    the three observations are ______, ______, and 9.           7, 9

61. Your data would be the values ______, ______, and ______,   7, 9, 9
    since these were the ______ values of the                   observed
                         observed / possible
    "answer" variable.

62. In the television survey, the answer of each of the
    three people was a ______, but the question                 variable
                       constant / variable
    they were asked was a ______.                               constant
                          constant / variable

63. It is often useful to distinguish between continuous and
    discrete variables. If you were to count the number of
    people in a room, it would be possible to have 8 people
    or 9 people, but it would not be possible to have 8 1/2
    people. It is this characteristic of the variable named
    "number of people in a room" which makes it a discrete
    variable. The variable is discrete since there are no
    values of the variable between 8 and 9, or between 10
    and 11, or between 12 and ______.                           13

63. (Continued)

    On the other hand, the variable we call "length" is an
    example of a continuous variable. No matter which
    particular pair of lengths you considered, it
    ______ be possible to imagine a length between              would
    would / would not
    these two lengths. For example, 8 1/2 inches is between
    8 inches and ______ inches. (By "between," we mean          9
                 7 / 9
    greater than one length but less than the other.)

    Similarly, 2 1/2 feet is a value of the variable called
    "length," which is b______ the values 2 feet and            between
    3 feet. In other words, 2 1/2 feet is larger than 2 feet
    but less than 3 feet.

64. A continuous variable has an unlimited number of values
    because no matter how close two values are to each
    other it ______ always possible to imagine another          is
             is / is not
    value which would lie between them. For example,
    even though 2.10 inches and 2.20 inches are close
    together, ______ inches is between them. On the             2.15
              2.15 / 2.25
    other hand, if the variable you are studying is discrete,
    you can find two values of the variable such that there
    is no value between them. For example, the number
    of pennies you have in your pocket would be a value of
    a ______ variable, since you could only                     discrete
      discrete / continuous
    have in your pocket one penny, two pennies, three
    pennies, and so on. Of course, you could not have any
    number between these values.

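The test that separates the two kinds of variable, whether another value always lies between any two values, can be tried out directly. The following is a hedged sketch in Python; the function names `length_between` and `counts_between` are hypothetical, and exact fractions are used only so that 2.15 comes out exactly.

```python
from fractions import Fraction


def length_between(a, b):
    """For a continuous variable, the midpoint always lies between two different values."""
    return (a + b) / 2


def counts_between(a, b):
    """For a count (a discrete variable), list the whole numbers strictly between a and b."""
    low, high = min(a, b), max(a, b)
    return list(range(low + 1, high))


# Continuous: a length between 2.10 inches and 2.20 inches.
middle = length_between(Fraction("2.10"), Fraction("2.20"))
print(float(middle))

# Discrete: no number of people lies between 8 and 9.
print(counts_between(8, 9))
```

The midpoint step can be repeated on the result as often as you like, which is one way to see why a continuous variable has an unlimited number of values.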


No matter how similar the weights of two objects, it would
always be possible to imagine an object whose weight
was less than one but greater than the other. Therefore,
"weight" is an example of a ______ variable.                    continuous
                            discrete / continuous

It will not be necessary to distinguish between discrete
and continuous variables very often. Most of the
illustrations in this text involve discrete variables.
Those illustrations that involve continuous variables
are treated as if they were discrete variables. For
example, if you measured a person's height to the
nearest inch, you might say he was 65 inches tall or
66 inches tall. Because you "round off" his "height" to
the nearest inch, you would never say he is 65 1/2 inches
tall. In other words, you are treating the ______               continuous
                                           continuous / discrete
variable called "height" as if it
were a ______ variable. In other words,                         discrete
       discrete / continuous
you are pretending that there are no values between
65 inches and 66 inches, or between 67 inches and
68 inches, and so on.
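
Rounding a height to the nearest inch is a simple way to see a continuous variable being treated as a discrete one. The sketch below is in Python; the function name `to_nearest_inch` is an assumption for illustration, and halves are rounded upward here by choice (Python's built-in `round` sends exact halves to the nearest even number instead).

```python
import math


def to_nearest_inch(height_in_inches):
    """Round a measured height to a whole number of inches, halves going up."""
    return math.floor(height_in_inches + 0.5)


# Three different measured heights, reported as whole inches.
for measured in (65.2, 65.5, 65.8):
    print(measured, "->", to_nearest_inch(measured))
```

After rounding, every reported height is a whole number of inches, so no reported value can fall between 65 and 66.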


67. To determine the number of people in a room, you
would count them. To determine the number of eggs
in a basket, you would count them. Whenever you
______ things, you are determining how many                     count
(what number) of things there are.

Any collection or group of things can be counted. The
procedure we call counting tells you the number of
things that are in the group or collection. For example,
we can determine the ______ of windows in                       number
a building by counting them.

It is often useful to distinguish between the name of
something and the thing itself. A "name" is something
you speak, write, or read. For example, "Boston" is
the name of a city. You could go to Boston, walk through
the streets of Boston, or even live in Boston. On the
other hand, the name "Boston" is something you speak,
write, or read. Similarly, if you had a dog named
"Rover," you might scratch the ______ behind the ears,          dog
                               dog / name
whereas you might print his ______ on his dog house.            name
                            dog / name

Suppose a mother didn't decide to name her new baby
"Kendall" until three days after the baby was born. The
______ would be three days old before it was given a            baby
baby / name
name.

We will use quotation marks, " ", when we are referring to the
name of something rather than to the thing itself. Thus, in the
previous example we would refer to the baby as Kendall
and to the name the baby received as "Kendall." Similarly,
when we speak of the city of Boston and of the name
"Boston," we will say that you could live in ______             Boston
                                             Boston / "Boston"
and that you write the name ______ as part of an                "Boston"
                            "Boston" / Boston
address on an envelope.

The number of things in a group or collection is a
characteristic of that group of things, just as the color of
a person's hair is a characteristic of that person. We
refer to a person's hair color with a name, such as "red,"
"brown," and "black." Similarly, we refer to the number
of things in a group with a name, such as "five," "twenty,"
"eighteen," and "forty." Thus, "red" would be the name
of a particular hair color and "five" would be the
______ of a particular number.                                  name

Names for other ______ are "two," "6,"                          numbers
"four," and "one hundred."

There are different ways of representing the same
number. For example, the ______ of                              number
players on a basketball team could be represented
either by the name "five" or by the name "5" or by
the name "V."

Number is a characteristic of a group or a collection
of things. The names (such as "ten," "four," "6")
that are used to represent this characteristic are often
called numerals. Therefore, the number of players
on a basketball team could be represented either by
the numeral "5" or the Roman numeral ______.                    V

Since the names "red," "green," "Democrat," and
"Boston" are not the names of numbers, they are not
numerals. Therefore, of the two names "6" and "blue,"
"______" is a numeral since it is the name of a                 6
number.

We pointed out earlier that there is a difference
between the name of something and the thing itself.
A numeral is the name we give to a number. You
determine the ______ of things in a group                       number
              number / numeral
by counting them; you represent this characteristic of
the group by writing or saying a ______.                        numeral
                                 number / numeral

The same number can be represented by different
numerals. "Four," "4," and "IV" are three names for
the same number. In other words, they are three
numerals which represent the same ______.                       number
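
The idea that several numerals can name one number can be illustrated with a small lookup. This is only a toy sketch in Python under stated assumptions: the tables `WORDS` and `ROMAN` cover just a handful of numerals, and the function names are hypothetical.

```python
# Small, deliberately incomplete tables of numerals (assumed for illustration).
WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
         "six": 6, "ten": 10, "twenty": 20}
ROMAN = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100}


def roman_to_number(numeral):
    """Read a Roman numeral; a smaller symbol before a larger one is subtracted."""
    total = 0
    for i, symbol in enumerate(numeral):
        value = ROMAN[symbol]
        if i + 1 < len(numeral) and ROMAN[numeral[i + 1]] > value:
            total -= value
        else:
            total += value
    return total


def number_named_by(numeral):
    """Return the number that a numeral (a name) stands for."""
    text = numeral.strip()
    if text.isdigit():
        return int(text)
    if text.lower() in WORDS:
        return WORDS[text.lower()]
    if text and all(symbol in ROMAN for symbol in text.upper()):
        return roman_to_number(text.upper())
    raise ValueError(f"{numeral!r} is not a numeral known to this sketch")


# Three different names, one number.
print({name: number_named_by(name) for name in ("Four", "4", "IV")})
```

A name such as "blue" raises an error here, which matches the point of the frames: it is a name, but not the name of a number.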
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Number is often referred to as a variable — for 
example, the number of autoinobile accidents in 
California each year, the number of people v/ho vote 
in an election, or the number of base hits a baseball 
player makes each game. In each of these cases, 
number could be thought of as a 


Like any other variable, the variable we call number 
has different particular values. Just as "red" and 
"greem' are names of particular values of the variable 
we call "color," — "three, " "4," "20, " and "ten" are 
n of particular values of the variable we 


call number. 


Consider the following list of letters: b, c, m, p. We 
could determine the of letters in 
this list by counting how many there were. The name 
(numeral) we would use to represent this number would 


usually be" m 
"Four" and"4" are two names for the same z 


Numerals are simply names we use to represent 
different values of the variable we call 5 


Roman numerals are names for particular numbers. 
Thus, the Roman numeral III represents the number 
we usually represent by the numeral "______ ." 


Numerals are often used to represent characteristics 
of things other than number — such as weight, length, 
temperature, age, and so on. When we say that the 


temperature is "20 degrees below zero," we are using 
the ______ "20" as a name for a particular 


number/numeral 


value of the variable we call temperature. 

85. When we use the names "twenty inches," "two feet," and 
"ten feet," we are using the ______          numerals 


numerals/numbers 


"twenty," "2," and "10" as names for particular values 
of the variable called length. 


86. Football and baseball players often have ______ 
written on the backs of their                   numerals 


numerals/numbers 


uniforms to help people in the stands to identify the 
different players. 


87. It is important to be careful when numerals are used 
to represent characteristics other than number. It 
makes sense to say that twenty things are twice as 
many things as ten things. It is also appropriate to 
say that something weighing twenty pounds is twice 
as heavy as something weighing ten pounds. It ______ 
necessarily make sense to say that a            does not 


does/does not 


baseball player with the numeral "20" written on his 
jersey is twice as good as the baseball player with the 
numeral "10" on his jersey. The fact that one player 
was assigned the numeral "10" and another assigned 
the numeral "20" does not necessarily imply anything 
more about the players themselves than does the 
difference in their names (John and Charles, for 


example). 


88. The value of a variable can often be represented by 
numerals. We shall refer to these variables as 
numerical variables. For example, we discussed an 
experiment earlier in which we were interested in the 
time it took a rat to run down an alley to reach food. 

The values of this running-time variable were 
represented by numerals such as "20 seconds," 
"10 seconds," and "800 seconds." Therefore, we ______ 
refer to running time as a numerical 
would/would not 


variable. 


Variables whose values are not represented by 
numerals will be called non-numerical variables. If 
we were interested in hair color, therefore, the values 
we might observe would be black, red, brown, and so 
forth. Since these values are not represented by 
numerals, we will refer to hair color as a ______ 


variable. 


numerical/ non-numerical 


Political party is a variable, and particular values of 
this variable are Democrat, Republican, Socialist, and 


so on. This would be an example of a ______ 
variable. 


numerical/non-numerical 


The age of American presidents when they were elected 
to office would normally be a ______ 
variable. 


numerical/non-numerical 


We said earlier that the data from a study was a record 
of the observed values of the variables being studied. 
If these observed values are represented by numerals, 
the variable under study is a ______ 
variable. We would refer to the data, therefore, as 
numerical data. 


numerical/non-numerical 


93. Records of the observed running times in the 
experiment in which we observed the time it took 
for a rat to reach some food would be ______ 
data.                                           numerical 


numerical/non-numerical 


94. Earlier, we considered a study in which people were 
asked which of three television channels they preferred: 
5, 7, or 9. We could name the three possible answers 
they gave as "5," "7," or "9." The names "5," "7," 
and "9" are used simply as names for their answers. 
However, "5," "7," and "9" are also used as names 
for the values of the variable we call number and are 
therefore called numerals. Even though the variable 
we call "their answer" is not really a number, its 


values are represented by ______ .              numerals 


numerals/numbers 


95. Because we said that any variable whose values could 
be represented by numerals would be called a 
numerical variable, we would say that the person's 
answer in the television survey was a ______ 
variable, even though the                       numerical 
numerical/non-numerical 
numerals we use as names for the values of this 


variable ______ represent different numbers in this     do not 
do/do not 


particular case. 


96. Because the list of the observed values of the answer 
variable would be a list of numerals, it would be an 
example of numerical d______ .                  data 

97. A list of the possible values of answers in the television 
survey ______ be called numerical data, however.        would not 


would/would not 

98. Such a list would not be called numerical data because, 
although it is a list of values represented by numerals 
(which makes it a numerical list), it is not a list of 
observed values. Only a list of ______ values           observed 
is referred to as data. 


99. Earlier, we listed how many pins a person knocked 
down each time he rolled a bowling ball. Each time he 
rolled the ball he could knock down anywhere from 
0 to 10 pins. Thus, any number between 0 and 10 was 
a(n) ______ value of the variable "pins                 possible 


possible/ observed 


knocked down." 


100. Each value of the variable "pins knocked down" was 
represented by a numeral. Therefore, "pins knocked 


down" is an example of a ______                 numerical 


numerical/ non-numerical 


variable. 


The numerals 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 are the 
______ values of that variable.                 possible 


101. The list 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ______      would not 
would/would not 
be considered data because it is simply a list of the 
______ values of the data.                      possible 


possible/ observed 
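
The difference between possible values and observed values in the bowling frames can be sketched in a few lines of Python. This is only an illustration: the names possible_values, observed_values, and is_valid_record are made up for this sketch, and the observed list is the one this program uses in its bowling illustration (0, 0, 2, 10, 5).

```python
# Possible values of the "pins knocked down" variable: every count from 0 to 10.
possible_values = list(range(0, 11))

# Observed values: a record of what actually happened on five rolls.
# Only a record like this one is called data.
observed_values = [0, 0, 2, 10, 5]


def is_valid_record(observed, possible):
    """Return True if every observed value is one of the possible values."""
    return all(value in possible for value in observed)


# Possible values that never turned up in this particular record.
never_observed = [v for v in possible_values if v not in observed_values]

print(is_valid_record(observed_values, possible_values))
print(never_observed)
```

The list of possible values can be written down before a single ball is rolled; the list of observed values cannot.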


102. The list of observed values of the variable used in the 
earlier illustration was 0, 0, 2, 10, and 5. This list 
______ be considered data.                      would 


would/would not 


103. This list of observed values ______ be           would 


would/would not 
considered numerical data, since each value of the 
variable (the number of pins knocked down) is 
represented by a ______ .                       numeral 


104. One way in which men differ from one another is 
whether or not they have a beard. Suppose you were 
interested in how many men wore beards. If you went 
to a busy intersection and made a list of whether or not 
each man who passed wore a beard, your list might look 
like this after 5 people had passed: 


PERSON   BEARD 
1        yes 
2        no 
3        yes 
4        no 
5        yes 


105. Since this list is a record of your observations, it 
______ be referred to as data.                  could 


could/could not 


106. Here we simply recorded the values of the variable 
we were studying as "______" or "______" rather than    yes, no 
writing out "He wore a beard" or "He didn't wear a 
beard." 


107. When we recorded the observations of whether or not a 
man wore a beard, we could have used the numerals 
"1" or "0" instead of "yes" or "no." We could have 
recorded a "1" for each man who had a beard and a 
"0" for each man who didn't have a beard. Thus, the 
list of observations we just presented would look like 
this: 


PERSON   BEARD 
1        1 
2        0 
3        1 
4        0 
5        (?) 


Since the fifth person we saw had a beard, we would 
complete our list by replacing the question mark with 
a ______ . 


1/0 


108. Because we represented the values of the variable we 
are studying with numerals, our data is ______ . 


numerical/non-numerical 


109. Earlier, we stated that ______ is the 


number/numeral 


characteristic of a group or collection of things which 
we determine by counting how many things are in the 
collection. 


110. The numerals "1" and "0" are usually used as names 
for particular numbers. However, when we use these 
numerals as names for the two values of the variable 
"bearded or beardless," they ______ represent 


do/do not 


the characteristic we call number. 
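
The recoding of the beard list can be sketched as follows. The observations are the five shown in the list above; the dictionary names and the second pair of numerals (7 and 3) are hypothetical, chosen only to show that any pair of numerals would serve equally well as names.

```python
# Observations from the beard list, one entry per man who passed.
beard_observations = ["yes", "no", "yes", "no", "yes"]

# The numerals 1 and 0 are used here only as names for the two values.
code_for = {"yes": 1, "no": 0}
coded = [code_for[answer] for answer in beard_observations]

# A hypothetical second coding: different numerals, same two values.
alternative_code = {"yes": 7, "no": 3}
recoded = [alternative_code[answer] for answer in beard_observations]

# Decoding either list recovers exactly the same record of observations.
decode = {numeral: name for name, numeral in code_for.items()}
decode_alternative = {numeral: name for name, numeral in alternative_code.items()}

print(coded)
print([decode[c] for c in coded])
print([decode_alternative[c] for c in recoded])
```

Both coded lists are numerical data, yet neither tells us anything that the "yes" and "no" list did not.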
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114. 
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117. 


Thus, we have just seen a case where our data was 
numerical but the numerals didn't have anything to 
do with the characteristic we call number. The 
______ were simply used as names for the 


numerals/numbers 


different values of the variable we were studying: "1" 
if he had a beard or "0" if he didn't. 


When we talk about number, we can say such things as 
"three people are a greater number of people than ______ 


two/ten 


people." 


We can say the number represented by the numeral "4" 
is twice as large as the number represented by the 
numeral "______ ." 


We can also say that since 7 minus 5 equals 2, and 
3 minus 1 equals 2, the difference between 7 and 5 is 
the same as the difference between 3 and ______ . 


Sometimes when numerals are used to represent 
variables other than number, we can make similar 
statements. For example, it would make sense to say 
that a temperature of 100 degrees above zero was 


greater (or hotter) than a temperature of ______ degrees 


50/150 
above zero. 


It would also make sense to say that "4" pounds was 
twice as heavy as "______" pounds. 


It would also make sense to say that the difference in 
height between a person who was 5 1/2 feet tall and one 
who was 6 feet tall was the same as the difference 
between someone who was 4 1/2 feet tall and someone who 
was ______ feet tall, since the difference in height was 
5/1 


1/2 foot in both cases. 


numerals 


50 
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123. 


We have seen cases, however, where numerals have been 
used to represent values of a variable so that statements 
of this sort are not appropriate. For example, when we 
considered asking people to choose their favorite from 
among three television channels, we represented the values 
of the answer variable with the numerals "5," "7," and 
"9." It ______ make much sense to say that 


would/would not 


choosing Channel 7 was greater than choosing Channel 5. 


Also, it ______ be appropriate to say that the 


would/would not 
difference between answering "5" and answering "7" was 
the same as the difference between answering "7" and 
answering "9" because 7 - 5 = 9 - 7. 


You should NOT assume that just because a variable is 
numerical it is similar to "number" in some way. It 
______ possible for a variable to be numerical and have 
is/is not 
nothing more in common with "number" than the use of 
numerals as names for its values. 
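
A short Python sketch may make the point about the television survey concrete. The list of answers below is invented for illustration (this program gives only the possible values 5, 7, and 9, not a record of answers), and the variable names are likewise assumptions of the sketch.

```python
from collections import Counter

# Hypothetical answers from eight viewers. The numerals are stored as text
# because they serve only as names for the channels.
answers = ["5", "7", "7", "9", "5", "7", "5", "7"]

# Counting how often each name occurs makes sense for any variable.
counts = Counter(answers)
most_chosen, times_chosen = counts.most_common(1)[0]

# Treating the names as numbers produces a figure that names no channel.
as_numbers = [int(answer) for answer in answers]
average = sum(as_numbers) / len(as_numbers)

print(most_chosen, times_chosen)
print(average)
```

Counting answers treats the numerals as names, which is all they are here. The average, by contrast, treats them as values of the variable we call number, and the result does not correspond to any channel a viewer could have chosen.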


A variable is numerical simply because we use 
______ as names for values of the variable. 


Later, we will consider in greater detail the use of 
numerals to represent variables other than number. For 
the present, we only wish to emphasize that statements like 
"greater than," "the same difference as," and "twice as 
much as" ______ appropriate or make 


are always/may not be 


sense, even when a variable is represented numerically. 


Many variables of interest to a scientist are similar in 
some way to the variable we call "number." For example, 
the variable called "length" is similar to "number," so it 
would make sense to say something ten inches long is 


longer than something ______ inches long. 
5/12 
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We determine the number of a collection of objects by 
counting them. We determine the length of something 
in a variety of ways, although the most common 
procedure is to use a ruler. Both counting and the use 
of a ruler are procedures by which we determine the 
appropriate numeral to represent a value of a particular 
variable. Procedures of this sort are called 
measurement procedures. Another example of a 

m______ procedure would be the use of 
scales to determine a person's weight. 


The numeral 10 represents a ______ number 


larger/smaller 


than does the numeral 5, and ten ounces represents a 
______ weight than five ounces. Therefore, it 


larger/ smaller 


is possible to make statements about values of the 
variable "weight" similar to the statement you made 
about the variable "number." For example, you could 
say the number 20 is twice as large as the number 10, 
just as a weight of 20 pounds is ______ as 
heavy as a weight of 10 pounds. 


On the other hand, some variables are not at all 
similar to the variable called "number," even though 
you could use numerals to represent values of these 
variables. For example, the fact that the license 
number on your automobile is larger than the license 
number on your neighbor's automobile probably 


______ indicate any difference between the two 
does/does not 


automobiles (although it might indicate you obtained 
your license number on an earlier date than did your 
neighbor). 




measurement 


larger 


larger 


twice 


does not 




126. Whenever a variable is similar to the variable we call 
"number," it is possible to assign numerals to represent 
values of that variable in such a way that you maintain 
the similarity between that variable and the variable 
"number." Determining how similar a particular 
variable is to the variable we call "number," and 
assigning numerals in the appropriate manner is called 
measurement. We will not have time to consider the 
topic of measurement in this program. The major point 
we wish to emphasize is that you should be careful not 
to suppose a particular variable is similar to "number" 
just because that particular variable is numerical. 
Numerals are often used to represent values of a 
variable simply because numerals are convenient or 
familiar names. Just because one value of a variable 
is represented by the numeral 8 and another value of 
the variable is represented by the numeral 4 ______ 
necessarily imply that one value is in 
does/does not 


any sense larger or greater than the other one. 


127. In this section, we have distinguished between "numeral" 
and "number" in order to indicate how easy it is to 
convert a non-numerical variable into a numerical one 
by simply substituting numerals for the non-numerical 
names of the values. On the other hand, in most of 
your reading you will find no distinction is made between 
"number" and "numeral." Therefore, throughout the 
remainder of this program we will use the word "number" 
without attempting to distinguish between "number" and 
"numeral." Remember, however, that using "numerals" 
to represent the values of a variable does not ensure any 
similarity between that variable and the variable called 
"number." The procedure of deciding how similar a 
variable is to number and assigning numerals in an 
appropriate fashion to represent this similarity is called 
m______ . 
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129. 


It is often useful to list things underneath one another, 
in what is called a column. The following list of 
numerals is arranged in a column. 

8 
6 
5 
2 
9 


The following list of colors is also arranged in a 
c______ .                                       column 

red 
green 
blue 
green 


Another way of listing things is side by side, in what is 
called a row. The same list of numerals we just 
arranged in a column could also be arranged in a row, 


as follows: 


8, 6, 5, 2, 9 


The same list of colors we just arranged in a column 
could be arranged in the following ______ .     row 


row/ column 


red, green, blue, green 
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130. Since a column is a list of things one underneath the 
other and a row is a list of things arranged side by 
side, these numerals are listed in a ______          row 
row/column 


238; 9;=:6,- 3 


and these days are listed in a ______                column 
row/column 
Monday 
Tuesday 
Wednesday 
Thursday 
Friday 


131. Earlier, we considered an experiment in which we 
observed how fast a rat ran down an alley to reach food 
on ten successive days. The following data was 
presented as an example of what we might have observed: 


DAY   RUNNING TIME 

1     200 sec. 
2     100 sec. 
3     150 sec. 
4     80 sec. 
5     40 sec. 
6     41 sec. 
7     15 sec. 
8     10 sec. 
9     4 sec. 
10    3 sec. 


The different observed values of the running time 
variable were listed in a ______ .              column 


row/ column 
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132. We can also list things in a pair of columns placed 
side by side. For example, consider the following: 


COLUMN COLUMN 
ONE TWO 
red brown 
green orange 
blue pink 


Thus, the colors appearing in the first column are red, 
green, and blue and those appearing in the second 


column are ______ , ______ , and ______ .       brown, orange, 
                                                pink 


133. We could also think of the same arrangement of colors 
as being made up of three rows. For example: 


ROW ONE     red     brown 
ROW TWO     green   orange 
ROW THREE   blue    pink 


Thus, the colors appearing in the first row are red and 
brown, while the colors appearing in the ______ row     third 
are blue and pink. 


134. A table is an arrangement of things in rows and columns. 
The arrangement of colors we just considered is a 
table because it was formed by ______ rows and ______   3, 2 
2/3   2/3 
columns. 
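
The same table of colors can be written as a list of rows in Python, which makes the ideas of "row" and "column" easy to check. This is a minimal sketch; the helper names row and column are made up for the illustration, and rows are counted from the top and columns from the left, starting with one.

```python
# The color table, stored as a list of rows.
table = [
    ["red", "brown"],      # row one
    ["green", "orange"],   # row two
    ["blue", "pink"],      # row three
]

n_rows = len(table)
n_columns = len(table[0])


def row(tbl, r):
    """Return row r, counting from 1 at the top."""
    return tbl[r - 1]


def column(tbl, c):
    """Return column c, counting from 1 at the left."""
    return [entries[c - 1] for entries in tbl]


print(n_rows, n_columns)
print(row(table, 1))
print(column(table, 2))
```

Reading across a row and reading down a column are two ways of looking at the same arrangement.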
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136. 


137. 


The table of numerals shown below consists of ______ 
2/3 


columns and ______ rows. 


Notice in the table of numerals that the numeral 8 is 
located in the first row AND the first column. The 
numeral 6 is also located in the first column but in the 
______ row. 


first/second 


Every numeral in the table falls in a particular 
combination of row and column. For example, the 
numeral ______ is located at the intersection of the 
third column and the first row. 


The numeral 5 is located in the ______ row and 
the ______ column. 

Notice that the numeral 4 is located in the first row, 
second column and also in the ______ row, 
______ column. 


In each of the tables we have seen, "row one" was the 
row on the ______ and "column one" was the 
top/bottom 
column on the ______ . This is the way we will 


left/right 


always number the rows and columns in a table. 
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second 
second 


second 
third 


top 


left 




138. 


139. 


It is often useful to present or record data in the form 
of a table. Suppose you were interested in how a 
person who took a three-day trip spent his money for 
food. You could record your observations of this 
"food cost" variable in a table like the one that follows. 


             THURSDAY   FRIDAY   SATURDAY 

Breakfast    $1.00      $1.15    $1.50 
Lunch 
Dinner                           $5.00 


Notice that instead of numbering rows and columns, we 


have identified the ______ with the various meals 
rows/columns 
of the day, while the ______ are identified with 


rows/columns 


particular days of the trip. The names of the meals 
and the names of the days identify the data 
in the rows and the columns. They are not themselves 
part of the data, since you ______ have written 


could/could not 


these row and column headings before you actually made 


the observations. 


The row and column headings identify the data; they 
______ part of the data. Since "breakfast" is the 


are/are not 


row heading of the first row and "Friday" is the column 
heading of the second column, the entry in the first row 
and second column of the data ($1.15) is the amount 
spent for ______ on ______ . 
breakfast/lunch   Thursday/Friday 


The amount spent for dinner on Friday was $ ______ , 
while the amount spent for ______ on ______ 
was $5.00. 
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rows 


columns 


could 


are not 


breakfast, Friday 


4.15 
dinner, Saturday 
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140. 


141. 


How much money was spent for all three meals on 
Saturday? $ ______                              8.00 


How much was spent for all three lunches? $ ______      7.00 


The day he spent the most money on breakfast, lunch 
and dinner was ______ . The most expensive      Friday 
meal on each of the three days was ______ .     dinner 

You may have heard of a psychology test called the 
Rorschach test, a test in which people are asked to 
report their impressions of ink blots, like the one 
shown below: 


This ink blot was formed by spilling ink onto the center 
of a piece of paper and then folding the paper in half so 
that the ink is squeezed or blotted between the folds of 
the paper. When Dr. Rorschach invented this test, he 
believed the shape of the blot was so vague that a 
person's answers would reflect as much about the 
person as about the ink blot. Suppose you were 
developing a test of this sort. You might make four 
different ink blots and show all of them to ten people. 
You could ask each of the people to report whether 
they thought a particular ink blot gave them an 
impression of something that was "pleasant," "neutral," 
or "unpleasant." Thus, their answer would be a ______ 
variable whose three                            non-numerical 


numerical/non-numerical 


possible values were "______ ,"                 pleasant, 
"______ ," and "______ ."                       neutral, 
                                                unpleasant 
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145. 


You could record the answers of the subjects in a table 
which might look like this: 


SUBJECT   INK BLOT A   INK BLOT B   INK BLOT C   INK BLOT D 

1         pleasant     pleasant     neutral      pleasant 
2         neutral      neutral      neutral      pleasant 
3         neutral      pleasant     neutral      pleasant 
4         neutral      unpleasant   neutral      pleasant 
5         pleasant     neutral      pleasant     pleasant 
6         unpleasant   unpleasant   unpleasant   unpleasant 
7         neutral      neutral      neutral      pleasant 
8         pleasant     neutral      neutral      pleasant 
9         neutral      neutral      neutral      pleasant 
10        pleasant     pleasant     pleasant     pleasant 


Notice that "INK BLOT A" is a ______ heading and 
row/column 
"SUBJECT 1" is a ______ heading. 
row/column 


The row and column headings ______ data, whereas 
are/are not 
the terms "pleasant," "neutral," "unpleasant" in the 
table ______ data (since they are observed values 
are/are not 
of the "answer" variable). 


The ______ headings identify who made the answer, 
row/column 
while the ______ headings identify the ink blot 
row/column 
being shown when the answer was made. 


146. All the answers in the first column of the table were 
made in response to ink blot ______ , whereas all       A 
A/D 
the answers in row 1 were made by subject ______ .      1 
1/10 
Subject 1 thought blot A was pleasant, blot B was 
pleasant, blot C was ______ , and blot D was ______ .   neutral, pleasant 
Subject 4 thought blot B was 
unpleasant, but he thought blot ______ was pleasant.    D 
Subject 7 thought blot D was ______ .                   pleasant 

147. Since the answers from all ten subjects to ink blot A are 
in the first column, we could determine how many 
subjects reported ink blot A was "pleasant" by counting 
all the occurrences of "______" in the ______           pleasant, first 
column. 

148. Four subjects reported ink blot A seemed "pleasant," 
whereas ______ subjects reported that ink blot B        three 
seemed "pleasant." 
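
Counting the occurrences of one value in a column, as the last two frames describe, can be sketched in Python. The two columns below are the answers to ink blots A and B as read from the table of answers (subjects 1 through 10, in order); the function name count_value is made up for this illustration.

```python
# Answers to ink blots A and B, one entry per subject (subjects 1 to 10).
answers = {
    "A": ["pleasant", "neutral", "neutral", "neutral", "pleasant",
          "unpleasant", "neutral", "pleasant", "neutral", "pleasant"],
    "B": ["pleasant", "neutral", "pleasant", "unpleasant", "neutral",
          "unpleasant", "neutral", "neutral", "neutral", "pleasant"],
}


def count_value(column, value):
    """Count how many entries in one column equal the given value."""
    return sum(1 for answer in column if answer == value)


pleasant_counts = {
    blot: count_value(column, "pleasant") for blot, column in answers.items()
}

print(pleasant_counts)
```

Each count comes from looking down a single column, which is why comparing ink blots means comparing columns.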
149. Subject 1 felt three of the blots were pleasant, whereas 
subject 2 saw only ______ of the blots as pleasant.     one 

150. What if you were told that one of your subjects was 
severely depressed and had been under treatment by a 
psychotherapist? You might expect someone who was 
very depressed or sad to find ink blots ______          more 
more/less 
"unpleasant" than would a typical subject. 

151. You would compare different ______ of the          rows 


rows/columns 
table in order to compare how different subjects 
reacted to the ink blots. 


152. If you were looking for a subject who seemed to find 
an unusual number of the blots unpleasant, you would 
probably pick subject ______ , since this subject found 
______ of the blots unpleasant. 


153. There might be something about particular ink blots 
which really does make them seem more or less 
pleasant. If you had to pick out the ink blot which 
might really appear more pleasant than the rest, you 
would probably pick ink blot ______ , since more subjects 
found this ink blot pleasant than found any of the other 
three ink blots pleasant. 


154. Notice in the table of data from the ink blot study 
how the column headings refer to particular ink blots. 
We could think of each of these four ink blots as a 
particular value of a variable, which we might call the 
"ink-blot" variable. Thus, each column of the table 
would be identified with a particular ______ of the 
"ink-blot" variable. 


155. Similarly, each row is identified with a particular 
subject. We could think of the subject number as a 
particular value of a variable we might call the 
"subject" variable. Thus, a particular row would be 
identified with a particular ______ of the 
"subject" variable. 
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156. When we draw the table, we might include the 
name of the variable whose particular values are 
associated with each row. For example, consider the 
following table: 


In this table, the ink blots can be thought of as a 
variable, whose particular values are A, B, ______ ,    C 
and ______ .                                            D 


Each value of the ink-blot variable is identified by a 
particular ______ .                                     column 


column/row 


157. You could also think of the subject to whom the ink 
blot was presented as a variable. Therefore, each 
particular subject is a particular ______ of that       value 
variable. The word "subject" in the table we have just 
seen refers to a variable, and the numerals 1, 2, 3, and 
4 identify particular ______ of that variable.          values 

158. The table is useful in organizing the data in terms of 
the subject variable and the ink-blot variable. The 
______ are associated with particular ink blots         columns 
rows/columns 
and the ______ are associated with the                  rows 


rows/columns 


particular subjects. 


159. If you were interested in comparing one subject's 
responses with another subject's responses, you would 
compare ______ . On the other hand, if you 


rows/columns 


were interested in comparing the responses to a 
particular ink blot with the responses to another ink 
blot, you would compare ______ . 


rows/columns 


160. For example, in the table we just saw, it is a simple 
matter to determine which ink blot was found 
unpleasant by the most subjects. Since you are 
comparing different ink blots, you would compare 
different ______ . The column that contains 
the most "unpleasant" responses is ______ . On the 
other hand, to find which subject made the most 
"neutral" responses, you would compare different 
______ . You would find that subject ______ 


rows/columns 


was the one who had made the most "neutral" responses. 


161. Suppose you were interested in comparing the grades 
made by four students in three different courses. We 
could arrange their grades in a table, as follows: 


COURSE 


DATA ON GRADES IN VARIOUS COURSES 


Notice the rows are identified by the initials of 
particular ______ , whereas the columns are 
subjects/courses 
identified with particular ______ . 
subjects/courses 


162. The word "course" is written at the top of the table 
and might be thought of as the name of a variable 
whose particular values correspond to the particular 
______ in the table. You could think of 


columns/rows 


______ , ______ , and ______ 
as particular values of the "course" variable. 


163. We identify each student by an initial. Thus, if we 
think of students as a variable, the particular values of 
that variable are ______ , ______ , ______ , and ______ . 


164. You would find the course in which the students did most 
poorly by comparing different ______ . You would 


columns/rows 
find the student who had done the best in his courses by 
comparing different ______ . 


rows/columns 


165. In the preceding table, the variable we are interested 
in is "grade." The different values of that variable are 
A, B, C, and D. The data presented in this table are 
______ observed values of the grade variable. 


12/4 


166. Earlier we considered a way in which you might keep 
track of your improvement as you took bowling lessons. 
After each lesson, you could bowl 5 balls and record 
the number of pins you knock down with each ball. For 
the purposes of this test, you always set up any pins 
you knock down before you bowl the next ball. Therefore, 
the fewest pins you could knock down with any one 
ball is ______ , while the greatest number of pins you 
could knock down with one ball is ______ . 
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columns 


Math, English, Gym 


A.M., U.T., A.C., 
R.K. 


167. Suppose you took 4 lessons and bowled 5 of these 
test balls after each lesson. Imagine you recorded 
the data from these tests in the following table. 


DATA FROM TESTS OF IMPROVEMENT 


Since your data are observed values of the 
"pins-knocked-down" variable, each value is identified with 
a particular ball you rolled following a particular 
lesson. In this table, the 5 test balls you rolled are 
associated with 5 different ______ , while the 
rows/columns 


4 lessons that you took are associated with 4 particular 
______ . 


rows/columns 


168. The total number of pins knocked down following the 
first lesson was ______ . 
6/10 


169. The total number of pins knocked down following the 
second lesson was ______ . The most pins were 
knocked down following lesson ______ , since there 
were ______ pins knocked down with the 5 balls rolled 
on that day. 




rows 


columns 


10 


35 




170. 


TL 


172. 


173. 


We have now looked at several tables and can
summarize certain things about tables. First of
all, the definition of a table is "any arrangement of
things in ______ and ______." When the
things you are arranging in rows and columns are data,
the row and column headings help you to identify each
particular piece of data, but they ______ part of the
are/are not
data.


It is often useful to think of the row and column headings
as particular values of a ______.

One of the first things you should learn to look at when
you see a table of data is the row and column headings.
These headings identify where the observed values
that make up the ______ were obtained.

For example, the table shown below contains data from
two different subjects: subject ______ and subject ______.
These two subjects were observed on four different
______.

[Table: data for two subjects on four days; entries not legible]




rows, columns 


are not 


variable 


data 


A, B 


days 
174. The data in the table below is ______                   different from
the same as/different from
the preceding table, since the data below was obtained
from four ______ on two different ______.                    subjects, days

[Table: SUBJECTS by DAYS; entries not legible]


This illustrates how carefully you should examine the 


row and column headings on a table. Many errors in 
interpreting data can be traced to a simple failure to 
pay sufficient attention to the headings on the table. 


175. You have seen that one way of presenting or recording
a list of values for some variable is to arrange the
names of these values in ______ and ______                   rows, columns
to form a table.

176. Next, we will consider a way of representing values of
a variable other than simply arranging the names of
the values in rows and columns to form a ______.             table

177. If you represented the numeral 1 with a square, you
could represent the numeral 2 by putting another square
on top of it. Thus, drawing ______ shown below would         A
A/B
represent numeral 1, whereas drawing ______ would            B
A/B
represent numeral 2.
A 
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178. 


179. 


By adding a third square on top, you could represent
the numeral 3. In fact, you could keep adding
squares, one on top of the other, for each new numeral
you wished to represent. For example, of the two
pictures shown below, Picture A would represent the
numeral 3, whereas Picture B would represent the
numeral ______.

A B

You can think of these squares placed atop each other
as forming columns, and you could place these columns
side by side as shown below. Each column represents a
numeral in the same way as the columns of squares you
just considered. In the picture shown below, we have
identified each column with a letter placed directly
underneath the column. Column A represents the
numeral "3" since this column is three squares high.
Column B represents the numeral "______" since this          4
column is four squares high, and column C represents
the numeral "______."                                        5
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179. (Continued) 


The width of each column is the same, but the height
differs depending upon which numeral the column
represents. In other words, the ______ of the                height
width/height
column determines which numeral it represents.


180. The same three columns are shown below. Now, however,
we have drawn a line with marks on it to represent
the different possible heights of the column. For
example, the mark next to the numeral 1 is at the
same height as a column one square high. The mark
next to the numeral 2 is at the same height as a column
two squares high, and the mark next to the numeral 3
is as high as a column ______ squares high.                  3

[Figure: Columns A, B, and C beside a vertical line marked with numerals]

181. Notice Column A is 3 squares high. This is indicated
by the fact that it is the same height as the mark next
to the numeral ______.
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182. 


183. 


184. 


Although Column B is not touching the line with the
marks on it, you can see that Column B is 4 squares
high and, therefore, at the same height as the mark
next to which the numeral ______ is written.                 4

The mark next to the numeral 2 is at the same height
as the top of Column ______ since that column is two
A/B/C
squares high.


Suppose you were tossing a die (one of two dice). You
might be interested in the number of dots shown on its
top when it came to rest. For example, the following
die has ______ dots showing on its top surface.              2

The number of dots appearing on the top of the die after
each new toss ______ be a variable since this                would
would/would not
number could change or vary from toss to toss. The
possible values of this variable could be represented
by the numerals "1," "2," "3," "4," "5," and "______."       6


We could represent each possible number of dots 
showing on the top of the die by a column of squares. 
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185. 


186. 


187. 


(Continued) 


For example, we could represent the die falling so
that 3 dots were on top by column ______ shown below:
A/B


4 
3 


A B 


Since the largest number of dots we could observe on
the top of the die would be ______, the highest possible
column of squares we would need would be ______ squares
high.

Since the fewest dots we could observe on the top of
the die would be a single dot, we could represent this
outcome by a column ______ square high.

Suppose you tossed a die four times and recorded the
number of dots showing after each toss. Your results
might be like those shown in the following table.
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189. 


190. 


(Continued) 


On the first toss, there were 5 dots showing, on the
second toss there were 3 dots showing, on the third
toss there were ______ dots showing, and on the fourth       6
toss there was ______ dot showing.                           1

Notice how the data in the previous table are represented
by a ______ of numerals indicating the different             column
column/row
observed values of the variable.

The column heading "dots showing" can be thought of
as the name of the variable we are observing and the
numerals in that column as the ______                        observed
observed/possible
values of that variable.


Notice that we have used numerals to represent the 
different observed values of the variable. These 
numerals are simply names for the values. Another 
way of representing these values is with columns of 
squares that form a sort of picture of the data. 


Four columns are shown on the following page. Each 
column represents one of the observed values in the 
previous table. We have identified the particular toss 
represented by each column by writing its name 


directly below that column. 
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(Continued) 


The column for the third toss is ______ blocks high since    6
6 dots were showing on the third toss.

[Graph: four columns of squares, one for each TOSS; vertical axis DOTS SHOWING, marked with numerals up to 6]

Since the line next to the column on the left has marks
to indicate the different possible heights of the columns,
we really don't need the lines between each square in
the column.

We have redrawn the columns below, leaving out the
lines separating each square in a particular column.

[Graph: the same four columns redrawn without the lines between squares; horizontal axis TOSS]

We can still see that the third column represents the
numeral 6 since the third column is the same height as
the mark next to the numeral ______.
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192. 


193. 


194. 


A figure of this sort where the values of a variable are
represented by the height of each column is a type of
graph. In this graph, each observed value of the variable is
represented by the ______ of each column                     height
rather than by a name or numeral, as in the previous
table.
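
If you would like to see such a graph built step by step, the following Python sketch draws one out of text characters. It uses the four observed values from the die-tossing frames (5, 3, 6, and 1 dots); the function name `column_graph` and the use of "#" to stand for a square are assumptions made only for this illustration.

```python
def column_graph(values, labels):
    """Return a text graph in which each value is drawn as a column of squares."""
    tallest = max(values)
    lines = []
    # Work from the top mark down to 1, so taller columns rise higher.
    for height in range(tallest, 0, -1):
        cells = ["#" if value >= height else " " for value in values]
        lines.append(f"{height} | " + "  ".join(cells))
    # The names of the tosses are written directly below the columns.
    lines.append("    " + "  ".join(labels))
    return "\n".join(lines)


dots_showing = [5, 3, 6, 1]  # the four observed values described in the text
graph = column_graph(dots_showing, ["1", "2", "3", "4"])
print(graph)
```

Each "#" is one square, so the height of a column, read against the numerals at the left, gives the observed value for that toss.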


When you roll a die, there will be a certain number of
dots showing on its top face when the die comes to rest.
For example, the die shown below has ______ dots             6
showing on its top face.

Suppose you tossed a die four times and recorded the
observations shown in the following table:

According to these data, the fewest dots were
showing on the ______ toss.                                  third
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196. 


Two graphs are shown below.

[GRAPH A and GRAPH B: column graphs with DOTS SHOWING on the vertical axis and the 1st, 2nd, 3rd, and 4th TOSS on the horizontal axis]


In the previous table it was indicated that there were
3 dots showing on the first toss. Both Graph A and
Graph B indicate there were ______ dots showing on the
first toss because the column representing the first
toss in both graphs is ______ squares high.

Compare the data shown in each of the previous graphs
with the data shown in the previous table. Graph ______
A/B
agrees perfectly with the table, whereas graph ______
A/B
does not.
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197. 


198. 


(Continued) 


Both the table and Graph B indicate one dot was showing
on the third toss, whereas Graph A indicates two
dots were showing on the third toss.

Notice how easy it is to compare the values in the two
graphs because the values are shown as a picture, rather
than as numerals in a table. Because it shows a picture
of the values rather than simply listing their names, a
______ is useful.                                            graph
graph/table

Earlier, you considered an experiment in which we
recorded the time it took a rat to run down an alley to
reach food. The data for that experiment were observed
values of the "running-time" variable. Let's consider
how we could represent values of the running-time
variable in the form of a graph. Four values we might
have observed in that experiment are shown below in the
form of a ______.
graph/table

[Table: RUNNING TIME in seconds on each of four days; entries not legible]
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199. 


200. 


201. 


The longest observed running time was ______ seconds.
The shortest observed running time was ______ seconds.

The graph shown below represents the same data as the
previous table. Notice that we have once again
represented a value of the variable by the height of each
column. The column representing Day 1 is the same
height as the mark next to the numeral 30. The column
representing Day 2 is the same height as the mark next
to the numeral ______. At the left of the graph we
have written what the numerals represent, that is,
"running time" in seconds. Since the column for Day 1
is next to a mark with the numeral 30, you know that
the rat took ______ seconds to run down the alley to
reach the food on the first day.

[Graph: one column per day; vertical axis RUNNING TIME IN SECONDS, marked at 10, 20, 30, and 40]

On Day 4, the rat took 5 seconds to reach the food (as
was indicated in the previous table). The column for
Day 4 is only half as high as the mark next to the
numeral 10. In other words, it represents a running
time of ______ seconds.
202. A table and a graph are shown below. The figure
marked A is the ______ and the figure
graph/table
marked B is the ______.
graph/table

[FIGURE A: table of SCORE by DAY. FIGURE B: column graph with SCORE on the vertical axis and DAY on the horizontal axis.]

203. The numerals along the side of the graph are different
values of the ______ variable.
"day"/"score"

204. The largest observed "score" was ______ on Day ______.
10/15    2/3

Notice how clearly this is shown by the graph: Day ______
2/3
has the highest column.


table 


graph 


"score" 
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206. 


Consider the following graph and table. The "score"
value shown for Day 1 ______ the same on the graph
is/is not
as on the table.

[Graph and table of SCORE by DAY; the table shows a question mark for Day 3]

A score of ______ is shown for Day 2, both on the graph
5/8
and on the table.

The observed value for Day 3 is indicated in the
______ but not in the ______.
graph/table    graph/table

To make the table identical with the graph, you should
put a score of ______ in place of the question mark shown
in the table.

It is clear that the scores were the same on Day 2 and 3
because the columns are the same ______ for
both days on the graph.
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208. 


Another graph and table are shown below. Fill in the
table so that it is identical to the data shown on the
graph. Now compare your work with the table given as
the answer.

[Graph: SCORE on the vertical axis, marked from 0 to 10; DAY on the horizontal axis. A blank table accompanies it.]

The data in the following graph are values of a variable
named "______." A value of this variable
was observed on each of three ______.
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(Continued) 


The largest cost was observed on Day ______, since the
column for that day is ______ than any other
column.

To fill in the column and row headings in the following
table so that it represents the same things as the graph
we just considered, you would label the column
containing the values with the word "______" and
the three rows with the numerals "______," "______," and
"______."

Since each row represents a different ______ on which
a cost was observed, the square above the numbers 1,
2, and 3 should have the word "______" written in it.

Fill in the table below so that it represents the same
things as the graph does.
212. What if you had a score of zero and you wished to
represent it in a graph? If a column three squares
high represented a score of 3, and a column two
squares high represented a score of 2, and a column 1
square high represented a score of 1, then a column no
squares high would represent a score of ______.              zero (0)

213. Therefore, on the following graph a score of zero was
recorded on Day 2 and Day ______.                            4

[Graph: SCORE on the vertical axis, marked 0 to 4; DAY on the horizontal axis]

214. Notice the line on the left of the graph has a mark to
indicate the height of each column and a numeral
indicating the "score" value of each height. The
numeral ______ indicates that if a column had                0
no height at all it would represent a score of ______.       0

215. Often you will see graphs in which the top of a column
is at a height between two of the numerals or marks
indicating values.
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215. (Continued) 


For example, the following column would probably
indicate the value ______ because it is lower than
20/30/40
40 but higher than 20, and approximately half-way
between their marks.


[Graph: a single column; vertical axis SCORE, marked at 0, 20, and 40; horizontal axis DAY]
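
Reading a column whose top falls between two marks amounts to a small calculation. The Python sketch below shows one way to write it down, assuming the marks are evenly spaced; the function name `value_between_marks` and its parameters are invented for this illustration.

```python
def value_between_marks(lower_mark, upper_mark, fraction):
    """Value shown by a column whose top lies `fraction` of the way
    from the lower mark up to the upper mark (0.5 means half-way)."""
    return lower_mark + fraction * (upper_mark - lower_mark)


# A column about half-way between the marks for 20 and 40.
halfway_value = value_between_marks(20, 40, 0.5)
print(halfway_value)
```

The word "probably" in the frame is worth keeping in mind: a value read from between two marks is only as exact as your judgment of where the top of the column lies.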
216. According to the following graph, the score on Day 1
was ______ and the score on Day 2 was ______.
25/50    15/100

[Graph: vertical axis SCORE, marked at 50 and 100; horizontal axis DAY]
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217. 


218. 


In the same way, you could represent the value 1½ with
a column one and one-half squares high. For example,
Column A would represent a score of 1½ and Column B
would represent a score of ______ because it is
2/2½
between the height of 2 squares and the height of
3 squares.


Suppose you were recording how much time it took a 
person to solve a puzzle. You might have collected the 
data shown in the table below: 


SUBJECT TIME TO SOLVE 
1 20 Seconds 


15 Seconds 
25 Seconds 


Thus, there are data from ______ subjects, and each
20/4
observed value is a particular ______.
time/subject
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Section II: Distributions 


We have referred to records of the observed values of
a variable as data.

It is often useful to summarize or describe data, rather
than simply listing all the observed ______ which            values
make up the data.

Suppose there were eight possible values of a variable
and you observed only three of these possible values.
Your data ______ be summarized by the                        could not
could/could not
statement: All the possible values of the variable were
observed.

Of the following two tables, Table ______ contains data      B
A/B
which might be summarized or characterized by the
statement: The value of the variable was the same for
all observations.

Table B indicates the value of the "score" variable was
______ seconds on each of the ______ days.                   20, 3
You could describe or characterize the data in
Table ______ above with the statement: A different value     A
was recorded each time the "grade" variable was observed.

Suppose the grades shown in the previous tables were
based on the usual grading system of A, B, C, D, and
F, where A is the "best" grade and F is the "worst."
You could describe the data shown in Table ______ with       B
A/B
the statement: An A was the best grade observed.

You could describe the data in Table(s) ______               A and B
A/B/A and B
by saying the "worst" grade observed was a D.

The reason we say these statements characterize or
summarize the data is that they tell us something, but
not everything, about the data. In the previous example,
we could describe Table ______ by saying: Only two of        B
the possible values of the "grade" variable were
observed. This statement describes how many of the
possible values are represented in the data. The
statement ______ tell us which particular                    does not
does/does not
values were observed.
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9. 


10. 


Consider the table of data shown below. 


Breakfast 


Lunch 


Dinner 


The data consist of three values of a variable named
"______."
"cost"/"meal"

You could characterize (describe) the data shown in the
preceding table with the statement: The largest observed
value of the cost variable was $______.

Saying $4.10 was the largest observed value ______
does/does not
tell you which particular meal cost the most.

If you were only interested in the smallest amount paid
for any one meal, the statement "$1.50 was the smallest
amount paid for any one meal" ______ describe
would/would not
the data in sufficient detail for your interests.

Therefore, if you were only interested in a particular
characteristic of the data, a summarizing statement
often answers your question most simply. However, a
summarizing statement ______ tell you as much
does/does not
about the data as a complete list or table of data.
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11. 


12. 


13. 


One of the first things you might ask about a collection
of data is how often particular values of the
variable were recorded. For example, the following table
indicates a "score" of "8" was observed ______ times
out of the 5 observations making up the data.

The score value 5 occurred the same number of times
as the value 2, since they each occurred ______.

You could characterize (describe) the data in the
previous table by saying the value "______" appeared
three times and the values "______" and "______" each
appeared once.

This summary of the data ______ be sufficient
would/would not
if you were only interested in which score value
occurred most frequently. The statement would tell
you that the value "______" was observed more often
than any other value.

The previous summary statement ______ tell
does/does not
you enough about the data to determine on what
particular day a score of "2" was observed.

Whether or not a certain way of summarizing the data
is suitable ______ depend on what particular
does/does not
characteristic of the data you are interested in.
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14. 


15. 


Suppose you asked ten people to judge whether a 
particular painting was "good" or "bad." You might 


obtain data of the sort shown in the following table. 


Judgement 


You could summarize this table of data by counting the
number of times each of the possible values of the
"judgment" variable was observed.

The two possible values of the variable named
"judgment" are ______ and ______.

According to the table of data, the value "good" was
observed ______ times and the value "bad" was observed
______ times.

Another way of saying the value "good" occurred 6 times
is to say the frequency of "good" was 6. Thus, instead
of saying the value "bad" occurred 4 times, you would
say the frequency of "bad" was ______.
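
Finding a frequency is nothing more than counting. The following Python sketch counts the judgments with `collections.Counter` from the standard library. The order of the ten judgments in the list is hypothetical; only the totals (6 "good" and 4 "bad") come from the frames above.

```python
from collections import Counter

# Ten judgments in a made-up order; the text gives only how many of each occurred.
judgments = ["good", "bad", "good", "good", "bad",
             "good", "bad", "good", "good", "bad"]

# Counter tallies how many times each value appears in the data.
frequency = Counter(judgments)

print(frequency["good"])
print(frequency["bad"])
```

Changing the order of the list would not change either count, which is why a frequency tells you how often a value occurred but not when.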
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17. 


18. 


19. 


20. 


21. 


If you were to say the frequency of a certain value
was ten, you would simply mean that you counted how
often that value had occurred in the data and found it
had occurred ______ times.

If you said the frequency of a certain value was 30, you
would mean that you had counted the number of times
that value had occurred in the data and found it had
occurred ______ times.

If you had a set of data in which a particular value
occurred ______ times, you would say the frequency of
that value was 25.

If 8, 6, 5, 2, 6, and 1 were a collection of data, you
would say the frequency of the value 6 was ______ and
2/3
the frequency of the value 8 was ______, since "6"
occurred twice in the data, whereas "8" occurred only
______.

If a collection of data contained the values 8, 8, 2, 9,
6, 6, and 5, the frequency of the value 8 would be
______ and the frequency of the value six would be ______.

To say a value has a frequency of zero means that
value occurred in the data ______ times.

In a particular set of data, therefore, we might find
the frequency of the value 20 was zero. This would
mean the value 20 ______
never occurred/occurred 20 times
in the data.
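
You can check your counting for a collection such as 8, 6, 5, 2, 6, and 1 with a few lines of Python. This is only an illustrative sketch; it relies on the fact that `collections.Counter` reports 0 for any value that never appears, which matches the idea of a frequency of zero.

```python
from collections import Counter

data = [8, 6, 5, 2, 6, 1]  # the collection of data given in the text
frequency = Counter(data)

print(frequency[6])   # how many times the value 6 occurred
print(frequency[8])   # how many times the value 8 occurred
print(frequency[20])  # a value that never occurred in the data
```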
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10 


30 


25 


once 


zero (0) 


never occurred 
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22. 


23. 


24. 


25. 


26. 


Suppose a variable had three values which could be
represented by the three letters A, B, and C. Suppose
your data consisted of the following observed values:
A, A, B, A, B, B, A, and A. The frequency of the
value A is ______ and the frequency of the value B is ______.

Since there were no observed values of the possible
value C, the frequency of C is ______.

If the data consisted of ten observations, we would have
a list of ten observed values. If all ten of these values
were the same, we could say the frequency of that
value was ______.

If the data are 100 observations, it ______
would/would not
be possible to have a frequency of some value which was
greater than 100.

Suppose you tossed a coin a hundred times and let it
fall on one side or the other each time. The frequency
of "heads" in your data could not possibly be greater
than ______.

The fewest number of times "heads" could occur would
be none at all. Therefore, the smallest possible
frequency of heads would be ______.
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27. 


28. 


Whenever we count things, we obtain a number. Since
we count the values in the data to find their frequency,
each frequency ______ a number. One way of                   is
is/is not
summarizing or characterizing data is to count how
often each of the possible values of the variable occurs.
You could determine a frequency of occurrence for
each of the possible values of the variable.

Consider the following table of data.

[Table: the grade recorded for each of 10 students; entries not legible]

The variable represented in the data is named "______"       grade
and a value of this variable was recorded for 10
students. Since the grade A occurred 4 times, the
frequency of the value A is ______.                          4
The frequency of B grades is ______, and the frequency
of C grades is ______ because no C's were recorded.
D and F both occurred just once. Thus the frequency
of D is the same as the frequency of F and equals ______.    1
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29. 


30. 


81. 


32. 


33. 


We could summarize this frequency information about
the data in the previous table as follows:

Possible Values     Frequency
of Grades

A                   4
B                   4
C                   0
D                   1
F                   1

The data in this table is represented by the numbers
4, 4, 0, 1, 1. These numbers ______ represent
do/do not
values of the "grade" variable; they do, however,
represent the ______ of times each possible
value was observed (occurred in the data).
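
A frequency table of this kind can be built by counting once for every possible value, including values that were never observed. In the Python sketch below the order of the ten grades is hypothetical; only the counts (four A's, four B's, no C's, one D, and one F) come from the text, and the names `possible_values`, `raw_data`, and `frequency_table` are invented for this illustration.

```python
# Every possible value of the "grade" variable, observed or not.
possible_values = ["A", "B", "C", "D", "F"]

# Ten grades in a made-up order; the text states only how often each occurred.
raw_data = ["A", "B", "A", "D", "B", "A", "B", "F", "A", "B"]

# Count the occurrences of each possible value; a value never seen gets 0.
frequency_table = {grade: raw_data.count(grade) for grade in possible_values}

for grade, frequency in frequency_table.items():
    print(grade, frequency)
```

Starting from the list of possible values, rather than from the observed ones, is what allows the grade C to appear in the table with a frequency of zero.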


Because grade A occurred four times, the numeral
4 occurs in the same row of the table as grade ______.

Because grade B occurred four times, the numeral
______ occurs in the same row as grade B. Since a
grade of ______ was never observed, a ______ occurs
next to that letter.

The last two rows in the frequency column contain 1's,
since both grade ______ and grade ______ were
observed only once.

The table in Frame 27, which contained a grade for each
student, is often referred to as a table of
raw data. The word "raw" means that we have not
summarized the data in any way; we have merely listed
all the observations. The data are represented in the
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33. 


34. 


35. 


36. 


37. 


38. 


(Continued) 


table in Frame 29 by the frequency of each
value. This table ______ be considered a
would/would not
table of raw data.

If you added together the frequencies of the possible
values shown in the table in Frame 29, you would find
that the sum or total of these frequencies equals ______.

The total of all the frequencies in a frequency table
______ equal to the total number of observations in
is/is not
the corresponding table of raw data.
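
This relation between the two kinds of table is easy to verify. The sketch below, in Python, compares the sum of the frequencies with the number of observations; the order of the grades in `raw_data` is hypothetical, while the frequencies are those given for the grade example.

```python
# Ten grades in a made-up order (four A's, four B's, one D, one F).
raw_data = ["A", "B", "A", "D", "B", "A", "B", "F", "A", "B"]

# The frequencies stated in the text for the grade example.
frequency_table = {"A": 4, "B": 4, "C": 0, "D": 1, "F": 1}

total_of_frequencies = sum(frequency_table.values())
number_of_observations = len(raw_data)

print(total_of_frequencies)
print(number_of_observations)
```

The two numbers agree because every observation is counted once, and only once, when the frequencies are determined.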


This is what we would expect, of course, since each
value in the table of raw data contributes to only one of
the frequencies in the frequency table. For example,
the four observed grades of A in the table of raw
data were only counted when the frequency for grade
______ was being determined.

If a coin were tossed a hundred times and you were
told the observed frequency of "heads" was ninety-nine,
you would know that the frequency of "tails" was ______
because the frequency of "heads" plus the frequency of
"tails" must equal ______.

If your table of raw data contained 1,000 observations
and a particular value had a frequency of 1,000, you
would know all the other possible values of the
variable had frequencies of ______.
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40. 


41. 


42. 


43. 


Summarizing or characterizing data in the form
of a frequency table is suitable if you are only
interested in the ______ of times each value
occurred in the data. If you were interested in the
sequence or order in which each value was observed, a
table of raw data ______ be suitable for your
would/would not
purposes.

Whether a frequency table or a table of raw data is
required ______ depend upon what particular
does/does not
aspect of the data you are interested in. Each of the
frequencies in the frequency table is a kind of summary
of your data obtained by counting how often each value
occurred. We ______ think of this frequency as a
can/cannot
number that describes or summarizes the data.

Any number or term that summarizes or describes a
collection of data is called a statistic. Each of the
frequencies, therefore, would be called a ______.

Frequencies are often called enumerative statistics,
because the word enumerate means to count and because
we ______ the number of times a value occurs
in the data in order to determine its frequency. Thus,
we refer to frequencies as enumerative statistics
because the word "enumerate" means to ______.

Each of the frequencies in a frequency table is a
number which summarizes the data. Each of these
frequencies is called an ______ statistic
because the word "enumerate" means to count.
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44. 


45. 


Suppose you were conducting the following experiment.
You present a subject with one pair of tones and ask
him which of the two tones appeared louder: the first
tone or the second tone. Suppose you had presented
the subject with ten pairs of tones and asked him to make
a response after each presentation. You might have
recorded the ten responses in the following table of
______.

Tone Pair    Answer

1            1
2            2
3            1
4            1
5            2
6            1
7            1
8            2
9            1
10           1

In this table, an answer of "1" represents a response
indicating that the first of the two tones was the louder,
whereas the answer "2" represents a response
indicating that the second of the two tones was the
louder. The data in this table could be called, therefore,
______ data.
numerical/non-numerical

The two possible values of the "answer" variable in
the previous table are "______" and "______." The number
of observations represented in this table is ______.
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rav data 


numerical 


12 
10 
46. The previous table of raw data could be summarized
in the following ______ table.                               frequency

Answer    Frequency

1         7
2         3

This frequency table contains the two enumerative
statistics "______" and "______."                            7, 3

47. Since the total of the frequencies in the frequency table
must equal the total number of observations in the table
of raw data, we did not have to count the number of
times answer 2 occurred if we knew how many times
answer 1 had occurred. For example, if the data in
the table of raw data had been different and the
frequency of the answer 1 had been 6, we would have
known immediately that the frequency of the answer 2
was ______, since 10 minus 6 equals 4.                       4
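
The shortcut described in this frame can be written out as a short calculation. The Python sketch below first counts the ten answers listed in the table of raw data for the tone experiment, and then finds a remaining frequency by subtraction. The helper name `other_frequency` is an assumption made for this illustration, and the shortcut works only when the variable has exactly two possible values.

```python
from collections import Counter

# The ten recorded answers from the tone-pair table of raw data.
answers = [1, 2, 1, 1, 2, 1, 1, 2, 1, 1]
frequency = Counter(answers)


def other_frequency(total_observations, known_frequency):
    """Frequency of the second value when a variable has only two values."""
    return total_observations - known_frequency


# Counting answer 1 is enough; answer 2 follows by subtraction.
frequency_of_2 = other_frequency(len(answers), frequency[1])

print(frequency[1])
print(frequency_of_2)
print(other_frequency(10, 6))
```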

48. We have referred to a table containing a list of each
observed value as a table of raw data. A table listing
the frequency of occurrence of each value is called
a ______ table.                                              frequency

49. Each frequency in a frequency table is the number of
times a particular ______ has been recorded in               value
the data.

50. Since the frequency of each value is determined by
counting, each frequency is a ______.                        number
A statistic is a term or number which describes some
characteristic of a collection of data. Thus, each
frequency in a frequency table is a ______.

We refer to the frequency of a particular value as an
enumerative statistic because the word "enumerate"
means to "______."

Because it lists the value of each observation, the
following table could be called a table of ______
data. In other words, the table presents a complete
list of the data.

Observations    Values

When the observed data contain only two different
values, we ______ be sure there are only two
can/cannot
possible values of the variable.

The following table could be used as a ______
table for the previous table of raw data by filling in the
frequencies for each value in the appropriate row of
column ______.
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(Continued) 


The frequency of the value A in the table in Frame 53 is
______. The frequency of the value B in the same             4
table is ______. Notice how the total of the frequencies
in the frequency table ______ the total number of            is
is/is not
observations in the table of raw data.

Suppose the variable represented in the table of raw
data in Frame 53 had four possible values: A, B, C,
and D. Since there were ______ observations of the           no
values C and D, the frequencies of C and D would both
be ______.                                                   0

A table of raw data and two frequency tables labeled
Table A and Table B are shown below. The frequency
table that corresponds to the table of raw data is
Table ______.
A/B

[Table of raw data, TABLE A, and TABLE B; entries not legible]
Notice that the value ______ does not appear in the 
preceding table of raw data and that its frequency 
______ (is/is not) presented in Table B. 


Table A could not represent the data in the table of raw 
data because the value 10 is represented as having a 
frequency of 3, whereas it should have a frequency of 
______, as it does in Table B. Any value of a variable 
not observed (recorded) in the table of raw data has a 
frequency of ______. 


If there were 8 observations in the table of raw data, 
we would know that the frequency of the value 15 was ______, 
so long as we knew that the frequency of the value 5 
was four and that the frequency of the value 10 was four. 


Every possible value of a variable has some frequency, 
whether or not it is recorded in the data, since a 
value has a frequency of ______ if it is not recorded 
in the data. 


In summary, we can say a table of raw data lists the 
______ of each observation, whereas a frequency 
table lists the ______ of times each value 
occurred in the table of raw data. 
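
The step from a table of raw data to a frequency table can be sketched in a few lines of Python. This is only an illustration: the function name `frequency_table` and the six observations are assumptions made for the example and are not the data of Frame 53.

```python
from collections import Counter

def frequency_table(raw_data, possible_values):
    """Count how often each possible value occurs in the raw data."""
    counts = Counter(raw_data)
    # A Counter reports 0 for any value that was never observed.
    return {value: counts[value] for value in possible_values}

# Hypothetical raw data: six observations of a variable
# whose possible values are A, B, C, and D.
raw = ["A", "B", "A", "A", "B", "A"]
table = frequency_table(raw, ["A", "B", "C", "D"])

for value, frequency in table.items():
    print(value, frequency)

# The frequencies add up to the number of observations.
print(sum(table.values()) == len(raw))
```

Listing the possible values separately from the data is what lets the unobserved values C and D appear with a frequency of 0.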


Each of the entries in a frequency table is a number 
and each of these numerals is called an ______ 
statistic. 


The total (sum) of these enumerative statistics equals 
the total number of observations in the table of raw 
65. 


66. 


A collection of frequencies in a frequency table is 
called a frequency distribution of that variable. A 
______ distribution indicates, therefore,          frequency 
how often each value of a variable occurred in the data. 


In other words, the collection of enumerative statistics 
in a frequency table indicating how often each value of 
a variable occurred is called a f______          frequency 
d______.          distribution 


Suppose you asked a subject to sort a collection of ten 
drawings into four boxes, each box labeled with one of 
the four adjectives excellent, good, fair, and poor. 
The subject would be distributing the ten drawings 
among the four possible judgments he could make 
concerning the merit of each drawing. If you recorded 
his performance, your table of raw data would contain 
______ (10/4) observations of a variable called "judgment."          10 
This variable has ______ possible values.          4 


When the subject was finished distributing the ten 
drawings among the four boxes, each box would have 
some particular number of drawings in it. The number 
of drawings in each box could be thought of as the 
______ of each value of the "judgment"          frequency 
variable. If the subject put five drawings in the box 
labeled "good" and five drawings in the box labeled 
"fair," the frequency of "good" judgments would be the 
same as the frequency of "fair" judgments, and would 
equal ______. 
(Continued) 


The frequency of "excellent" judgments and of "poor" 
judgments would both equal ______. 


The numerals 5, 5, 0, and 0 ______ (could/could not) be called 
a collection of enumerative statistics describing the 
subject's judgments. 


It ______ (would/would not) be appropriate to say the numerals 
5, 5, 0, and 0 in the frequency table define a frequency 
distribution. 


It is often useful to present enumerative statistics by 
means of a graph. We could represent the frequency 
of each value with the height of a column, just as we 
represented the ______ of a particular 
observation by the height of a column in earlier graphs. 


Of the two types of graphs shown below, Graph ______ (A/B) 
would be appropriate for showing the frequency 
distribution of a variable named "score." 
If there were only six possible values of the "score" 
variable, you would know that 6 columns would be 
required in the frequency distribution graph in order 
to represent all six values. However, since there 
were only four observations, the largest possible 
frequency for any score would be ______ (4/6). 


The following frequency distribution graph contains 
two columns. Score A has a frequency of ______ and 
Score B has a frequency of ______ (as indicated by the 
heights of columns A and B respectively). 


[Graph: frequency (axis marked 0, 10, 20) for Score A and Score B] 


The table of raw data for the previous frequency 
distribution graph would have contained a total of 
______ (20/30) observations. 


The number of observations represented by the 
following frequency distribution graph is ______, since 
A occurred ______ times, B occurred ______ times, 
and C occurred ______ times. 


[Graph: frequency distribution of the values A, B, and C] 
You know that the most frequently occurring grade in 
the previous frequency distribution is C, because this 
grade has the ______ column. 


You can think of each column in the frequency 
distribution graph as composed of a series of blocks, 
one block for each observation of that particular value. 
For example, consider the graph shown below. 


[Graph: frequency of the values A and B, drawn as stacked blocks] 


This graph indicates a frequency of ______ for Grade A 
and a frequency of ______ for Grade B. Therefore, the 
column for A is four blocks high and the column for B 
is ______ blocks high. 


Notice the number of blocks in column A forms a 
column twice as high as column B, indicating that the 
frequency of A is ______ as great as the 
frequency of B. 


Enumerative or frequency statistics are very important 
in elections. You are undoubtedly familiar with 
interpreting the data from elections even if you had not 
thought of these data as statistics. To illustrate this 
point, suppose that you were conducting an election in 
which there were three candidates: R. K., A. C., and 
R. A. We could think of each person's vote as an 
observation having one of three possible values. These 
values would be a vote for ______, ______, or ______. 


We could summarize the data from this election in a 
table such as the one shown below. 


Considering "vote" as a variable, this table would be a 
frequency table indicating that the value "R. K." had a 
frequency of ______, the value "A. C." had a frequency 
of ______, and the value "R. A." had a frequency of ______. 


The number of votes cast for candidate R. K. is 
represented by the number 25 and the number of votes 
cast for candidate A. C. by the number 75. Both of 
these numbers are en______ statistics. 
The group of three numbers, 25, 75, and 50, is the 
distribution of the 150 votes among the three 
candidates. In other words, these numbers define a 
______ ______. 
This frequency distribution could also be represented 
by a graph. The graph shown below ______ (would/would not) 
represent the frequency distribution shown in the 
previous frequency table. 


[Graph: frequency for candidates R. K., A. C., and R. A.] 


Notice the total number of voters in the election could 
be determined simply by adding all the ______ 
in the frequency table. 


It is clear, therefore, that ______ people voted in the 
election. 


The winner of the election is indicated by the candidate 
who has the ______ column in the frequency 
distribution graph. This was the candidate named ______, 
who received ______ votes. 


Suppose you were a psychologist interested in studying 


how accurately a person could throw darts at the dart 


board shown below: 
(Continued) 


You might conduct the following experiment. The 
subject would be asked to stand ten feet away from the 
target and attempt to hit the innermost ring. Suppose 
you told him that he would be scored as follows: 3 
for the inner ring, 2 for the middle ring, 1 for the 
outer ring, and 0 for a complete miss. You would 
then ask him to toss the dart at the board twenty times, 
recording his score for each toss. 


In this experiment, the distance of the subject from the 
board would be a ______ (constant/variable), whereas the score          constant 
the subject received for each toss would be a 
______ (constant/variable).          variable 


Suppose you obtained the data shown in the following 
table: 

TOSS   SCORE      TOSS   SCORE 
  1      0         11      1 
  2      1         12      2 
  3      1         13      2 
  4      2         14      2 
  5      1         15      3 
  6      0         16      1 
  7      2         17      3 
  8      1         18      2 
  9      3         19      3 
 10      2         20      3 

This table indicates that the subject ______ (did/did not) miss          did 
the target completely on his first toss. The first bulls-eye 
he made was on the ______ toss.          9th 
(Continued) 


The subject completely missed the target ______ 
times out of the twenty tosses. 


To form a frequency distribution, we must count the 
number of times each ______ of the score 
variable occurred in the data. 
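
The counting described here can be sketched in Python. The twenty scores below are read from the table of raw data on the assumption that its left-hand columns hold tosses 1 through 10 and its right-hand columns tosses 11 through 20; the variable names are chosen for this illustration only.

```python
from collections import Counter

# Scores for tosses 1-20, in the order they were thrown
# (assumed reading of the two-column table of raw data).
scores = [0, 1, 1, 2, 1, 0, 2, 1, 3, 2,
          1, 2, 2, 2, 3, 1, 3, 2, 3, 3]

counts = Counter(scores)

# List every possible score so that an unobserved score
# would still appear, with a frequency of 0.
distribution = {value: counts[value] for value in (0, 1, 2, 3)}

for value, frequency in distribution.items():
    print("score", value, "frequency", frequency)

# The frequencies must total the number of tosses.
print(sum(distribution.values()) == len(scores))
```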


We already noted that the value 0 occurred only twice 
in the data. Therefore, the frequency of a score of 0 
would be ______. 


The frequency of a score of 1 is ______. The 
frequency of 2 is ______ and the frequency of 3 is ______. 


This group of ______ statistics defines 
a frequency ______ of the variable called 
"score." 


Suppose the data for twenty tosses indicated the 
frequency of 1's was 5 and the frequency of 2's was 15. 
The frequencies of both 0's and 3's must therefore be 
______. 


A graph of the ______ (raw data/frequency distribution) is shown 
below: 
Graphing the raw data in this way reveals an 
interesting characteristic of the subject's performance. 
In general, the more tosses he took, the ______ (better/worse) 
was his performance. 


His performance appears to have improved as he 
continued his tosses because the columns tend to be 
______ (higher/lower) towards the right-hand side of the graph. 


A table and a ______ are shown below. The 
table contains the frequency distribution we just 
considered. This same frequency distribution ______ (is/is not) 
represented by the graph. 


SCORE FREQUENCY 
The most frequently occurring score is immediately 
obvious, since the highest column in the frequency 
distribution graph is for a score of ______. 


While the frequency distribution graph clearly indicates 
the frequency of each score value, it ______ (does/does not) 
indicate the gradual improvement in performance during 
the course of the experiment. 


A graph of ______ (raw data/frequency distribution) would be the 
best way to indicate how the subject's performance 
improved during the course of the experiment, whereas 
a graph of ______ (raw data/frequency distribution) is the 
simplest way of showing how the subject's tosses were 
distributed among the different scores. 
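
A short Python sketch may make this contrast concrete. It uses the twenty dart scores in toss order (again assuming the table of raw data reads tosses 1 through 10 on the left and 11 through 20 on the right) and compares what the ordered raw data shows with what the frequency distribution keeps.

```python
from collections import Counter

# Scores in the order the darts were thrown (assumed table layout).
scores = [0, 1, 1, 2, 1, 0, 2, 1, 3, 2,
          1, 2, 2, 2, 3, 1, 3, 2, 3, 3]

# The raw data keeps the order of the tosses, so the two halves
# of the experiment can be compared.
first_ten_total = sum(scores[:10])
last_ten_total = sum(scores[10:])
print(first_ten_total, last_ten_total)

# The frequency distribution ignores order: rearranging the
# same scores produces exactly the same frequencies.
rearranged = sorted(scores, reverse=True)
same_distribution = Counter(rearranged) == Counter(scores)
print(same_distribution)
```

Because the rearranged scores give the same frequency distribution, the distribution alone cannot show whether performance improved over the course of the experiment.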


In the previous experiment, one toss of the dart might 
actually be closer to the center of the target than 
another and yet receive the same score. For example, 


consider the target shown below: 


If the X marks labeled "A" and "B" represent two 
places where a dart could have hit the target, the point 
labeled ______ (A/B) would represent a more accurate toss 
of the dart. 
(Continued) 


If all darts falling within the center circle of the 
target receive a score of 3, the dart hitting at Point A 
would receive a score of ______ and the dart hitting 
at Point B would also receive a score of ______. 


Therefore, although one dart was actually thrown more 
accurately than another, it received the same score. 

In evaluating the subject's performance, it is easier to 
group all tosses of the dart that hit within the center 
circle into the same category of accuracy. In a similar 
manner, we do not consider exactly how far away from 
the target a dart was if it missed the target completely, 
since all such darts received a score of ______. 


Another way we might assign a score to each toss of 
a dart would be to measure exactly how far from 
the center of the target each dart struck. If we 
followed this procedure, there ______ (would/would not) be four 
possible values of the "score" variable. 


The number of possible values of the "score" variable 
would depend upon how precisely we wanted to measure 
the distance between the dart and the actual center of 


the target. 


If we measured this distance to the closest one-thousandth 
of an inch, there would be ______ (more/fewer) possible values 
of the score variable than if we measured down to the 
closest one-hundredth of an inch. The more precisely 
we measured the distance, the more possible values 
of the score variable there would be. 
100. 


101. 


The more precisely we measured the exact distance 
between the center of the target and the place where 
the dart struck, the ______ (more/less) often would we find two 
tosses of exactly the same score. If we measured 
closely enough, in fact, we would probably find that the 
subject never received exactly the same score on any 
two tosses. 


It is often useful to consider two observations that are 
really slightly different as having the same value. In 
other words, to group together observations that are 
sufficiently similar and to consider them as having the 
same value is often a useful procedure. The following 
illustration will indicate how we can group observations 


in this way. 


Suppose you were studying how long it took a person to 
solve a certain mathematical problem. If you made 
observations on twenty people, recording the time it 
took each person to solve the problem, your data might 
be represented by the following table: 


Time in Time in 
Subject Seconds Subject Seconds 
1 60 11 20 
(Continued) 


Notice there ______ (is/is not) a value of the time variable 
which occurs more than once in the table of data. (You 
might wish to insert a bookmark indicating this page, 
since we will refer to the table shown above in the next 
several frames.) 


Since every observed value of the data occurs once and 
only once, each of these observed values would have a 
frequency of ______. 


Notice that the highest value (the longest time) 
recorded was ______ seconds, whereas the lowest 
value recorded was ______ seconds. 


Suppose we counted all the "times" of 50 seconds or 
faster. We would find that exactly five of the observed 
times fell into this group of times. This group of times 
would consist of the 32 seconds observed for Subject 7, 
the 16 seconds observed for Subject 9, the 20 seconds 
observed for Subject 11, the 15 seconds observed for 
Subject 18, and the ______ seconds observed for 
Subject 5. 


We could say the frequency of "times" between zero and 
50 seconds was five, since there are exactly five 
observed times that were less than or equal to 
______ seconds. 
107. 


Suppose we counted the "times" between 51 seconds 
and 100 seconds (including times of 51 seconds or 
100 seconds). The time recorded for Subject 1 
______ (would/would not) be counted, since this time is between 
51 seconds and 100 seconds. 


There are exactly ______ observed times that fall 
between 51 and 100 seconds in the previous table of 
data. The next group of times that we will consider are 
those times between 101 and 150 seconds, including any 
that might be 101 or 150 seconds. Subject 2's time of 
128 seconds ______ (does/does not) belong to the group of times 
between 101 and 150 seconds. The other times which 
fall into this group are the time of 149 seconds recorded 
for Subject 8, the time of 110 seconds recorded for 
Subject 10, the time of ______ seconds recorded for 
Subject ______, and the time of 116 seconds recorded for 
Subject 16. 


We could summarize these frequencies in the following 


frequency table of grouped data. 


TIME            FREQUENCY 
0-50 sec.           5 
51-100 sec.        10 
101-150 sec.        5 


Notice the three rows correspond to the three different 


groups of times. 
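
Counting observations by group rather than by exact value can be sketched as follows. The function name `grouped_frequencies` and the ten times are hypothetical; they are not the twenty times of the table in the text, so the frequencies printed differ from 5, 10, and 5.

```python
def grouped_frequencies(values, groups):
    """Count the observations falling in each inclusive (low, high) group."""
    table = {}
    for low, high in groups:
        label = f"{low}-{high} sec."
        table[label] = sum(1 for v in values if low <= v <= high)
    return table

# Illustrative solution times in seconds (hypothetical data).
times = [60, 128, 32, 149, 16, 110, 20, 116, 15, 75]

# The same three groups used in the text, end points included.
groups = [(0, 50), (51, 100), (101, 150)]
table = grouped_frequencies(times, groups)

for label, frequency in table.items():
    print(label, frequency)
```

Each time is counted in exactly one group because the groups do not overlap, so the grouped frequencies still add up to the number of observations.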
107. (Continued) 


For instance, in the first row "0-50 seconds" indicates 
that we counted all the times between 0 and 50 seconds. 
The numeral 5 in the same row but in the frequency 
column indicates that the ______ of observed          frequency (number) 
times falling within this group was ______.          five 

108. A frequency table of this sort is called a frequency table 
of grouped data because we have determined the 
frequency for ______ of values, rather than          groups 
the frequency for particular values. Below is a graph 
of the previous frequency table of grouped data: 
[Graph: frequency by time in seconds for the groups 0-50, 51-100, and 101-150] 


This frequency distribution gives a reasonable picture 
of the distribution of times for the different subjects. 
It indicates clearly that some subjects had times less 
than 50 seconds, that about the same number had times 
greater than 100 seconds, and that the most typical 
times fell between ______ and ______ seconds.          51, 100 
109. 


If you are interested in obtaining a rough picture of the 
frequency distribution, it is often useful to group the 
data in this way, that is, not distinguishing between 
values that fall within the same group. 


Just how much you can group your data and still obtain 
a sufficiently clear picture of the distribution depends 
on your particular purposes. For example, in the 
previous illustration in which subjects were throwing 
a dart at a target, it was sufficient to divide the board 
into three circles and assign one of three scores 
depending upon where each dart struck. Consider the 
two targets shown below. Target A is divided as the 
target was in the previous illustration. In Target B, 
however, more circles have been drawn, thereby 
dividing the target into narrower rings. 


TARGET A TARGET B 


If you assigned a score to each dart depending upon 
which ring it struck, there would be more possible 
scores on Target ______. The only difference between 
Target A and Target B, however, is that you could 
distinguish the position of a dart more precisely on 
Target ______. 
110. 


111. 


112. 


113. 


114. 


115. 


In a similar way, we might have divided the previous 
mathematical problem data into smaller groups. For 
example, we could have divided each of the groups of 
times into two smaller groups. Instead of considering 
all the times from 0 to 50 seconds, we would have 
considered all the times from 0 to 25 and from 26 to 50 
seconds. Instead of considering all the times from 51 to 100 
seconds as a single group, we would have formed two 
groups, 51 to 75 and 76 to 100. Finally, we could have 
divided the group of scores from ______ to ______          101, 150 
seconds into the following two groups: 101 to 125 
seconds and 126 to 150 seconds. 


If you become a scientist, one of your chief jobs will be 
to make observations of variables. Records of these 
observations are called ______.          data 


We have seen that one way of presenting data is to 
arrange it in rows and columns to form a ______.          table 


A table listing every observed value of the variable is 
often referred to as a table of ______ data.          raw 


The number of times a particular value occurs in a 
table of raw data is referred to as the ______          frequency 
of that value. 
You ______ (can/cannot) determine the frequency of each value          can 
for both a numerical and a non-numerical variable. 
116. 


117. 


118. 


119. 


120. 


121. 


Instead of presenting a complete list of all of the 
observed values in your data, it is often useful to 
summarize the data in some way. Any term or number 
summarizing data in this way is called a s______. 


One way we can summarize data is to report the 
frequency of occurrence for each value in the table of 
raw data. Therefore, each of these frequencies would 
be a s______. 


Since the frequency of a value is determined by counting 
(enumerating) the number of times it was observed, a 
frequency is often referred to as an ______ 
statistic. 


Any value of the variable not observed (one that does 
not occur in the table of raw data) would have a 
frequency of ______. If the data consisted of 
20 observations, the greatest possible frequency of any 
value would be ______. 


We can summarize the contents of a table of raw data 
in a table listing the number of times each value 
occurred in the data. This summarizing table is often 
referred to as a ______ table. 


The enumerative statistics presented in a frequency 
table indicate how our observations were distributed 
among the different possible ______ of the 
variable we were studying. 
122. 


123. 


124. 


125. 


126. 


The frequency distribution of a collection of data 
is the collection of frequencies for the values in that 
data. Therefore, the enumerative statistics in a 
frequency table indicate the ______ ______ 
of the data. 


If you were told that a coin had been tossed 100 times 
and had come up "heads" 75 times and "tails" 25 times, 
you ______ (would/would not) know the frequency distribution 
of the data. 


Suppose you tossed a coin 100 times and observed 50 
"heads" and 50 "tails." Instead of reporting the actual 
frequency of "heads" or "tails," you might describe the 
distribution of the variable by saying that one-half of the 
tosses were "heads" and that ______ of the 
tosses were "tails." 


If you observed 500 "heads" and 500 "tails" in 1000 
tosses of a coin, you could ______ (also/not) say that one half 
of your observations were heads and one half of your 
observations were tails. 


When you say one half of the observations are "heads," 
you are not indicating the actual frequency of "heads" in 
the data. Instead, you are indicating what part of all 
the observations is "heads." For example, if I told 
you I tossed a coin some unknown number of times and that 
one half of the observations were "heads," you 
______ (would/would not) know the actual frequency of heads; 
you ______ (would/would not) know, however, what part of the 
data was made up of observations of "heads." 
127. 


128. 


129. 


If you knew the frequency of a particular value was 10, 
you would not know whether observations of that value 
represented a large part of the data or a small part 
of the data. If a value had a frequency of 10, 
observations of that value would represent a larger 
part of your data if the data consisted of ______ (12/100) 
observations rather than ______ (12/100) observations. 
12/100 


Instead of reporting the actual frequencies of each value 
in your data, it often is useful to indicate the relative 
frequency of each value. The relative frequency of a 
value indicates how often that value occurred in relation 
to the total number of observations. For example, if 
you said the value 10 occurred 30 times in the data, you 
would be indicating the actual or absolute frequency of 
the value 10. If you said one-third of the total number 
of observations had the value 10, you would be 
indicating the ______ (absolute/relative) frequency of the value 10. 


Adding together all of the frequencies in a frequency 
table indicates the total number of observations in the 
data. To find out what proportion or part of the total 
is represented by a particular frequency, you simply 
divide that frequency by the total number of 
observations. If your data consisted of 100 
observations and a particular value had a frequency of 
50, the proportion of your observations having that 
value would equal 50 divided by ______, which 
equals one-half. 
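
The division described in this frame can be written out as a short Python sketch. The function name `relative_frequency` is an assumption made for the illustration; `Fraction` comes from Python's standard library and keeps the proportion as an exact fraction.

```python
from fractions import Fraction

def relative_frequency(frequency, total_observations):
    """Proportion of all observations represented by one frequency."""
    return Fraction(frequency, total_observations)

# A frequency of 50 among 100 observations.
half = relative_frequency(50, 100)
print(half)

# The same frequency of 10 is a larger part of 12 observations
# than of 100 observations.
part_of_12 = relative_frequency(10, 12)
part_of_100 = relative_frequency(10, 100)
print(part_of_12 > part_of_100)
```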
130. If your data consisted of only 4 observations and a 
particular value had a frequency of 1, then the 
proportion of times that particular value occurred is 
______ divided by ______, or one-fourth.          1, 4 


131. If your data consisted of 1000 observations and you 
observed a particular value 250 times, the frequency 
of that value would represent one-fourth of the total 
number of observations, since 250 divided by 1000 
equals ______.          one-fourth 

132. It is possible to represent a proportion in several 
different ways. For example, you could represent the 
proportion "one-half" either as a fraction, which would 
be written as 1/2, or as a decimal, which would be 
written as .5. Similarly, you could represent the 
proportion "one-tenth" as either a fraction, which 
would be written ______, or as a decimal, which          1/10 
would be written ______.          .1 


If your data consisted of 100 observations of a variable 
and 10 of those observations had the value 4, you could 
say the proportion of observations in your data having 
the value of 4 was ______ (representing the          10/100 
proportion as a fraction) or ______ (representing          .1 
the proportion as a decimal). 


133. It is possible to convert a particular fraction into 
another form representing the same value. For 
example, 2 is the same as i. Similarly, you can 


write a decimal in several ways. For example, .2 
is the same as ______ (.20/.02).          .20 
133. 
135. 


(Continued) 


Statisticians often find it useful to represent a 
proportion in terms of hundredths. In these terms, 
one-fourth would equal 25/100, whereas one-half would 
equal 50/100. Similarly, 3/4 would equal ______ 
hundredths. 


If a certain proportion equals 25/100, we often say that 
it equals 25 percent. Similarly, if a proportion equals 
50/100, we can say it equals fifty percent. Representing 
a proportion as a percentage is simply a convenient way 
of comparing proportions by converting them all into 
hundredths. For example, you could compare one-half 
and one-fourth by saying one-half equals ______ (50/25) 
percent, whereas one-fourth equals ______ percent 
(since 1/2 = 50/100 and 1/4 = 25/100). Similarly, 
the proportion .25 represents 25/100 or ______ 
percent. 
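
Converting a proportion into hundredths is a single multiplication, sketched here in Python. The function name `to_percent` is an assumption for the illustration; the proportions are the ones used in the frames above.

```python
from fractions import Fraction

def to_percent(proportion):
    """Express a proportion in hundredths, that is, as a percentage."""
    return proportion * 100

# The same rule works for a proportion written as a fraction ...
for proportion in (Fraction(1, 2), Fraction(1, 4), Fraction(3, 4)):
    print(proportion, "is", to_percent(proportion), "percent")

# ... and for a proportion written as a decimal.
print(0.25, "is", to_percent(0.25), "percent")
```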


If you said the frequency of a particular value in a 
collection of data represented 50 percent of the data, 
you would know the frequency of that value represented 
______ hundredths of the data. Thus, if the data 
consisted of 100 observations, a value would have to 
have a frequency of ______ in order to represent 
50 percent of the observations, since 50 percent is 
equivalent to 50/100. A value observed on five out of 
ten observations also represents 50 percent of those 
10 observations since 5/10 equals ______ 
hundredths. 


99 


75 


50 


25 


25 


50 


50 


50 




136. 


137. 


138. 


If your data consisted of one thousand observations, of 
which 250 had the value 5, you could say that ______ 
percent of your observations had the value 5. If 250 
out of one thousand observations had the value 5, then 
250/1000, or 25/100, of the observations would have 
the value 5. This is why you could say that 25 
p______ of the observations had the value 5. 


Instead of the word "percent," the symbol % is often 
written after a number. Thus, 25% means the same 
thing as 25 ______. 


A proportion written as a percent is often referred to 
as a percentage. Therefore, the following is a list 
of ______s. 


20%, 30%, 5%, 8%, 2% 


Remember, ______ (20%/25) represents a proportion of 20/100. 


While it is often useful to represent a frequency as a 
proportion or percentage, it is important to 
consider the actual frequency. For example, suppose 
an automobile dealer told you 3/4 of the people to 
whom he had sold a particular car claimed it was the 
best car they had ever driven. If he had sold one 
thousand of these cars, the proportion 3/4 would 
represent ______ (750/250) people. On the other hand, if 
he had only sold four such cars, only ______ people 
would have told him it was the best car they had ever 
driven. 
Remember, 50% of 1000 is ______, whereas 
50% of 100 is ______. 
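
The dealer's example can be checked with a short Python sketch, which runs the earlier division in reverse: multiplying a proportion by the total number of observations recovers the actual frequency. The function name `actual_frequency` is an assumption for this illustration.

```python
from fractions import Fraction

def actual_frequency(proportion, total_observations):
    """Number of observations that a proportion of the total stands for."""
    return proportion * total_observations

three_fourths = Fraction(3, 4)

# The same proportion stands for very different numbers of people.
buyers_of_1000 = actual_frequency(three_fourths, 1000)
buyers_of_4 = actual_frequency(three_fourths, 4)
print(buyers_of_1000, buyers_of_4)

# Likewise, 50% (50/100) of 1000 and of 100.
fifty_percent = Fraction(50, 100)
print(actual_frequency(fifty_percent, 1000), actual_frequency(fifty_percent, 100))
```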


100 


25 


percent 


percent 


percentages 


20% 


750 


500 
50 
139. 


140. 


141. 


142. 


It is easy to convert any frequency table into a table 
listing the proportion of times each value was observed. 
To do so, you simply divide each frequency in the 
frequency table by the total number of observations in 
your data. By doing so, you convert each 
______ (absolute/relative) frequency to a(n) 
______ (absolute/relative) frequency or proportion. 


Consider the frequency table shown below: 


VALUE    FREQUENCY    PROPORTION 
red          2 
green        3 
blue         5 


The data represented in this table consist of ______ 
observations of a ______ (numerical/non-numerical) variable. 


In the third column of the above table, we could list 
the proportion of observations having each of the three 
values. For example, the value "red" occurred ______ 
times in the ______ observations. Therefore, the 
proportion of times "red" occurred equals ______ divided 
by ______. 


The proportion of times that the value "green" was 
observed equals ______ divided by ______, which 
equals ______. 


101 


absolute, relative 


10 
non-numerical 
143. The ______ of times the value "blue"          proportion 
occurred in the data equals 5 divided by 10, or 1/2. 


144. 


VALUE    FREQUENCY    PROPORTION 
red          2           2/10 
green        3           3/10 
blue         5           5/10 


The numbers in the third column of this table are the 
______ we just calculated.          proportions 


145. Each of these proportions tells what part of the total 
number of observations is represented by each value. 
If a particular value had a frequency of 0, you would 
know that none of the observations had that value; 
therefore, the proportion of observations having that 
value would equal 0 divided by the total number of 
observations, which means the proportion would equal 


146. Suppose you asked 100 people to predict which party, 
Democratic or Republican, would win the next 
Presidential election. Your data might look like that 
summarized in the frequency table shown below: 


              FREQUENCY 
DEMOCRAT          75 
REPUBLICAN        25 
146. (Continued) 


You could describe your data as consisting of 100 
observations of a variable you might call "predicted 
party," with "Democratic" and "Republican" the two 
possible ______ of that variable.          values 
147. The proportion of people predicting Democratic was ______,          3/4 
whereas the proportion of people predicting Republican 
was ______.          1/4 
148. The number 75 indicates the ______ (absolute/relative) frequency          absolute 
of the value "Democratic" in the previous data, whereas 
the number 3/4 indicates this value's ______ (relative/absolute)          relative 
frequency. 


149. Just as the frequencies 75 and 25 describe the absolute 
frequency distribution of the previous data, the 
proportions 3/4 and 1/4 describe the relative frequency 
distribution of the previous data. 


If you tossed a coin 20 times and observed 10 "heads" 
and 10 "tails," the numbers ______ and ______          1/2, 1/2 
indicate the relative (proportional) frequency 
distribution of your observations. 
150. 


151. 


152. 


Suppose you asked 1,000 people, rather than 100 people, 
to predict who would win the election. You might have 
obtained the data represented in the following frequency 
table: 


              FREQUENCY 
DEMOCRATIC 
REPUBLICAN 


In the example where we asked 100 people, 3/4 of them 
predicted "Democratic" and 1/4 predicted "Republican." 
In the present example of 1,000 predictions, the 
proportion of people predicting "Democratic" is ______, 
whereas the proportion of people predicting "Republican" 
is ______. 


The important point to be made here is that even though 
the ______ (absolute/relative) frequency distributions were 
different for the two examples, the ______ (absolute/relative) 
frequency distributions were the same. 
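
This point can be checked with a short Python sketch that converts each frequency table into a relative frequency distribution. The frequencies 75 and 25 come from the first poll in the text; the frequencies 750 and 250 for the second poll are an assumption, chosen to agree with the stated proportions of 3/4 and 1/4. The function name is likewise only illustrative.

```python
from fractions import Fraction

def relative_distribution(frequency_table):
    """Divide each frequency by the total number of observations."""
    total = sum(frequency_table.values())
    return {value: Fraction(frequency, total)
            for value, frequency in frequency_table.items()}

poll_of_100 = {"Democratic": 75, "Republican": 25}
# Assumed frequencies for the poll of 1,000 people.
poll_of_1000 = {"Democratic": 750, "Republican": 250}

relative_100 = relative_distribution(poll_of_100)
relative_1000 = relative_distribution(poll_of_1000)

# Different absolute distributions, identical relative distributions.
print(poll_of_100 == poll_of_1000)
print(relative_100 == relative_1000)
```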


Let us consider another case in which we change an 
absolute frequency distribution to a relative frequency 


distribution by converting each frequency to a 
proportion. Suppose you have the data represented in 


the following frequency table: 


VALUE FREQUENCY 
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3/4 


1/4 


absolute 


relative 
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152. 


153. 


154. 


155. 
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158. 


(Continued) 


Note that this data consists of observations of 
10/12 
a variable. 


numerical/ non-numerical 
Furthermore, the frequencies in the table are for 


data. 
grouped/ ungrouped 


The number 3 represents the number of times a value 


between and occurred in the data. 


The number of times a value between 11 and 15 was 


observed is . 


The group of values between and 
occurred more often than any other group of values 


in the data. 


The proportion of times a value was observed between 
6 and 10 is equal to divided by , which 
equals 1/2. 


The proportion of times a value between 0 and 5 was 
observed is , whereas the proportion of 
times values between 11 and 15 were observed is . 


Therefore, the frequency distribution 
absolute/ relative 


of the previous data is represented by the numbers 3, 5, and 2, 
whereas the frequency distribution is 


absolute/ relative 


represented by the numbers 3/10, 5/10, and 2/10. 
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10 


numerical 


grouped 


3/10 
2/10 (1/5) 


absolute 


relative 
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159. 


160. 


161. 


162. 


TABLE A 


VALUE FREQUENCY 


0 - 5 5 
6 - 10 3 
11 - 15 2 


TABLE B 


VALUE FREQUENCY 


0 - 5 300 
6 - 10 500 
11 - 15 200 


Of the two tables shown above, the table having the same 


relative (proportional) frequency distribution as the 


data shown in Frame 152 is Table 


Notice that Table B consists of data from 


A/B 


observations. Thus, each frequency would be 


converted to a proportion by dividing it by 


Expressing the frequency of a value as a proportion 


indicates what part of the total number of observations 


were of that particular value. The largest possible 
proportion you could obtain would occur when all the 


observations in your data had the same value. This 


proportion would be . 


For example, if the frequency of a particular value 
were 1,000 in a collection of 1,000 observations, the 


relative frequency (proportion) of observations having 


that value equals 


divided by 


, or 1. 


If a particular value had a frequency of 0, it would be 
represented in a relative frequency distribution by a 


proportion of . 
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163. Notice that all of the proportions in a relative frequency 
distribution must lie somewhere between the largest 


possible proportion of and the smallest possible 1 
proportion of . 0 
164. Each of the proportions in a relative frequency 


distribution represents the part of the total collection of 
observations having a particular value. All of the parts 

of something added together must equal the whole. 

Therefore, the sum of all the proportions in a 

proportional distribution must always equal . 1 


165. A little thought will indicate why the total of all of the 
proportions in a proportional distribution must equal 1. 
Each proportion is a fraction whose numerator is the 
frequency of some value and whose denominator is the 
total number of observations represented in the data. 
The sum of all these fractions would simply equal the 
sum of all the numerators over this common 
denominator. However, the sum of all the numerators 
is the sum of the frequencies in the frequency table, 


which equal the total number of does 
does/ does not 


observations. Therefore, the sum of all of the fractions 


or proportions would always equal . 1 
For example, 2/10 + 5/10 + 3/10 equals (2 + 5 + 3)/10, which 
equals 10/10, or . 
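
The argument in this frame can be checked with a short Python sketch. It is only an illustration; the function name `proportions` is an assumption, and the frequencies 3, 5, and 2 are the ones used in the earlier grouped-data frame. Each frequency is divided by the total number of observations, so the proportions share a common denominator and their numerators add up to that denominator.

```python
from fractions import Fraction

def proportions(frequencies):
    """Divide each frequency by the total number of observations."""
    total = sum(frequencies)            # the common denominator
    return [Fraction(f, total) for f in frequencies]

props = proportions([3, 5, 2])          # 10 observations in all
print(sum(props))                       # 1

# Every proportion lies between the smallest possible value, 0,
# and the largest possible value, 1.
print(all(0 <= p <= 1 for p in props))  # True
```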


166. We stated earlier that a statistic is any number or term 
that summarizes a collection of data. Each of the 
proportions in a proportional distribution summarizes 
something about the frequency of a particular value in 
the collection of data. Therefore, each proportion 


be called a statistic. would 
would/would not 
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167. Of the three tables shown below, two are 


frequency tables and the other is a absolute 
absolute/ relative 
frequency table. relative 
absolute/relative 
TABLE A 


Ca 


0-100 sec. 
101-200 sec. 


0-100 sec. 


101-200 sec. 


FREQUENCY 


1/10 
9/10 


0-100 sec. 
101-200 sec. 


168. Table C is a relative frequency distribution describing 
the data in Table . Aor B 
A/B/Aor B 
169. Even though the enumerative statistics in Table A 


are different from those in Table B, both of these 
absolute frequency distributions can be represented 


by the same frequency distribution. relative 
(proportional) 
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170. 


171. 


Just as you could present an absolute frequency 

distribution on a graph, it is also possible to present a 

proportional distribution in graphic form. In the 

distribution graph, we represented the frequency of 

each value by the of a column. In this same 

way, we may represent each proportion in a proportional 

distribution. 

The highest possible column that could ever occur on 

a proportional distribution graph would represent a 

proportion of . The three graphs shown 

below represent the same data shown just previously 

in tables A, B, and C. Notice that Graph C is a 
frequency distribution graph, 

just as Table C was a proportional frequency table. 
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172. 


173. 


174. 


Notice that it is difficult to compare the two absolute 
frequency distributions in Graph A and B because the 
number of observations in each collection of data was 


the same/ different 


different 


The difference in the size of the collection of data could 
easily obscure the similarity of the two distributions. 
This similarity is the fact that the relative 


absolute/ relative 
frequency distributions of the two groups of time are 


the same. 


Therefore, if you wish to compare two collections of 
data when there are different numbers of observations 


in each collection, it is often useful to use a(n) 
or relative, 


absolute/ relative proportional 


frequency. 
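
The point of these frames can be sketched in Python using the grouped frequencies 3, 5, 2 and 300, 500, 200 from the earlier frames. This is an illustration only: the function name `to_relative` is an assumption, and so is the pairing of the second set of frequencies with the same three value groups. The two absolute distributions differ, but converting each to proportions shows that they are the same relative distribution.

```python
from fractions import Fraction

def to_relative(frequencies):
    """Convert {value group: absolute frequency} to {value group: proportion}."""
    total = sum(frequencies.values())
    return {group: Fraction(count, total) for group, count in frequencies.items()}

small = {"0-5": 3, "6-10": 5, "11-15": 2}          # 10 observations
large = {"0-5": 300, "6-10": 500, "11-15": 200}    # 1,000 observations (groups assumed)

print(small == large)                              # False
print(to_relative(small) == to_relative(large))    # True
```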
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REVIEW I 


FILL IN THE BLANKS: 


If a value is not recorded in the data, the value has a 


frequency of . 


A collection of frequencies in a frequency table is called 


a frequency 


Let us assume that we have two red marbles, five green 
marbles, and four blue marbles in a box. The 
proportion of red marbles is 


Let us assume that we have two groups in an experiment. 


Group A has ten brunettes and five blonds. Group B has 
twenty brunettes and ten blonds. The relative frequency 


of the two groups is . 


the same/different 


The largest possible proportion is 


The sum of the proportions in a proportional distribution 


must equal 


Totals 




What is the score in the second row, fourth column? 
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zero 


distribution 


2/11 


the same 
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MULTIPLE CHOICE: 


10. 


11. 


A list of things, one underneath the other, is called a: 


a. column. 
b. row. 
c. radial. 


d. none of the above 


A list of things arranged side by side is called a: 


a. column. 
b. radial. 
c. row. 


d. none of the above 


If we list all the observations and do not summarize 


them, we have: 
a. finished data. 
b. theoretical distribution. 
c. raw data. 


d. none of the above 


A number, or term, that summarizes or describes a 


collection of data is called: 
a. raw data. 
b. a statistic. 
c. a theoretical distribution. 


d. none of the above 
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12. 


Frequencies are often called: 
a. raw data. 
b. inoperative. 
c. enumerative statistics. 


d. none of the above 


TRUE OR FALSE: 


13. 


14. 


15. 


16. 


17. 


18. 


19. 


We refer to something that does not change during an 


experiment as a constant. 


Something that does change in an experiment is 


called a variable. 
The numbers one to ten are written on pieces of paper 
and placed in a hat. The number 4 is drawn from the 


hat. The number 4 is an observed value. 


Let us assume that we have an ordinary die. On such a 
die, the number 7 is a possible value. 


The number of nickels in a cookie jar would be an 


example of a continuous variable. 


Numerals can only be used to represent numbers. 


Only a list of observed values can be referred to as 
data. 
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The review that you have just completed can assist you in evaluating your 
progress at this point in the program. If you had no difficulty with the review, 
proceed to the next section. If you did have trouble with any of the review questions, 
return to the place in the program where this material is presented and make sure 
you understand the material before going on to the next section. 


Follow this procedure with each of the reviews in the program. 
Section III: Central Tendency 


You have seen how any collection of data can be regarded 
as a distribution of values. By "distribution" we mean 
the number of times each of the possible values has 

been . 


Next, we shall consider some of the ways in which 
distributions differ and how these differences can be 
described. Consider, for example, the two distributions 
shown below. 


[Two frequency distribution graphs, DISTRIBUTION A and DISTRIBUTION B, each with VALUE on the horizontal axis] 


You would probably say that the typical observed value 
in Distribution B is than the typical 


larger/ smaller 


value of Distribution A. 


While the two distributions have approximately the same 


shape on the graph, the center of Distribution B is at 
a value than the center of Distribution A, 


larger/ smaller 
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You could say the central tendency of Distribution A is 

different from the central tendency of Distribution B. 

In other words, if the values in one collection of data 

are generally larger than the values in another collection 

of data, you could say the distributions of the two 

collections of data have central a different 


the same/a different 
tendency. 


Consider the three collections of data listed below: 


Data A: 2, 4, 3, 12, 6, 4 
Data B: 10, 12, 14, 16, 50 
Data C: 1, 3, 2, 3, 2, 6 


Notice that the values in Data A are more similar to the 


values in Data than they are to the values in C 
C/B 
Data . B 
C/B 
The distributions of Data A and Data B seem to have 
central tendencies, whereas the different 
similar/ different 
distributions of Data A and Data C have similar 


similar/ different 


central tendencies. 


If the values in two distributions were quite similar, we 


would describe the two distributions as having similar 
central tendencies 


Consider the two sets of data shown below. 
Data A: 4, 3, 6, 6, 4, 6 
Data B: 21, 23, 20, 21, 23, 20 
You could describe Data as having a larger typical B 
A/B 


value than the other collection of data. 
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(Continued) 


Suppose the two collections of data represented the ages 
of people at two different parties. In other words, 


Data would have been collected at a children's 
A/B 


party and Data would have been collected at a 
B 


party for young adults. 


A person who was 5 years old would be more representative 


of people at a party corresponding to Data than 
A/B 


he would be to the data collected at the other party. 


Even though it does not occur in Data A, the number 5 is 
more representative or typical of that collection of data 
than it is of the other collection. In a similar sense, the 


number 22 is more typical of Data than it is of 
A/B 


the other collection of data. 


The two previous collections of data differ mainly in the 
magnitude or size of the recorded values. This can be 
illustrated by the two graphs of raw data shown below. 


[Two graphs of raw data, DATA A and DATA B: the value of each OBSERVATION shown on a vertical scale from 0 to 25] 
Notice how the values in Data all seem to be close 
A/B 


to (cluster around) the value 22, whereas the values in 


the other data seem to cluster around the value . 
5/10 
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Suppose a student named John received grades of 75, 
70, 76, 75, 76 in a course. Another student, 
Dick, received grades of 95, 95, 96, 75, 98. A grade of 
75 would be more typical or representative of 
grades, even though both students had 


John's/ Dick's 


received a grade of 75 during the course. 


The reason a grade of 75 is more typical of John's grades 
is that a score of 75 was unusual for Dick. Most of 


95/72 


Dick's grades seem to be clustered around . 


We would say, therefore, that the central tendency of 
John's grades seems to be than the central 


higher/lower 


tendency of Dick's grades. 


If someone asked you to characterize John’s work by 
reporting his typical grade, you would be more likely to 


answer than you would . 
15/85 15/85 
[Two frequency distribution graphs: DISTRIBUTION A and DISTRIBUTION B] 


Notice how the values in Distribution A seem to 


be clustered around the value whereas the values 
5/10 


in Distribution B are clustered around the value 5 ei 
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19. 


In other words, you could describe the difference 
between the two distributions shown in the previous 
frame by pointing out that the central tendency of one 
distribution was different from that of the other. For 
example, if you said you were referring to the 
distribution whose typical or central value was 10, it 
would be clear that you were referring to Distribution 


rather than to the other distribution. 
A/B 


The characteristic of distributions we have been referring 
to as the "typical value" is generally a value near the 
"center" of the distribution. In other words, we can 
usually think of the values in a distribution as being 
clustered around (close to) a typical or central value. 
That is why we refer to this characteristic of a 


distribution as its central 


Up to this point we have not been very specific about 
what we mean by "central tendency." We have purposely 
not done so because there is more than one way to define 
the typical value or central tendency of a distribution. 

In other words, there more than one acceptable 


is/is not 


way of defining the central tendency of a distribution. 


If one value occurred more frequently in a distribution 
than any other value, you might consider that value the 
most common or typical value of the distribution. 
Therefore, one way of characterizing the central tendency 


of a distribution be to report the most 
would/ would not 


frequently occurring value in that distribution. 
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Statisticians use the most frequently occurring value in 

a distribution to characterize the central tendency of 

that distribution. Statisticians call this most frequent 

value the mode of the distribution. Thus, the value having 

the highest frequency in a frequency table would 


would/ would not 
be called the mode of the distribution represented in that 
frequency table. 


D 


2 
3 
4 
5 


The value occurring most frequently in this collection of 


data is , since this value has a frequency of . 40, 3 


The mode is the most frequently occurring value in a 
distribution; therefore, 40 is the of the mode 


distribution shown in the previous table. 


The number could be used to characterize the 40 
central tendency of the distribution shown in the previous 
table if you used the mode to characterize the central 


tendency. 


The column on a frequency distribution highest (tallest) 


graph indicates the most frequently occurring value in 


that distribution. 
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25. 


26. 


27. 


(Continued) 


The value having the highest column in the graph shown 
above is . We could characterize the central 
tendency of this distribution by reporting that its 


was 5. 


You can indicate that 5 is the most frequently occurring 
value in the previous data by saying that 5 is the modal 
value. Thus, if the mode of a distribution is 12, you 
would say 12 was the m (most frequently 


occurring) value in that distribution. 


Suppose you were considering a distribution in which the 
largest frequency was 10. Suppose, however, that more 
than one value had a frequency of 10. In this case, both 
of these values could be referred to as the mode of that 
distribution. For example, suppose you had a collection 
of data consisting of 10 observations of a variable with 
3 possible values: a, b, and c. If the frequency of both 
a and b was 4, and the frequency of c was 2, you 

say that both a and b were modes in that 


could/ could not 


distribution. 
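
Finding the mode, including the case of more than one mode, can be sketched with Python's standard `statistics` module. This is offered only as an illustration; the variable names are assumptions. The data follow the example in this frame: 10 observations in which a and b each occur 4 times and c occurs 2 times.

```python
from collections import Counter
from statistics import multimode

# 10 observations of a variable with three possible values
observations = ["a"] * 4 + ["b"] * 4 + ["c"] * 2

frequencies = Counter(observations)   # absolute frequency of each value
modes = multimode(observations)       # every value that has the highest frequency

print(frequencies["a"], frequencies["b"], frequencies["c"])  # 4 4 2
print(modes)                                                 # ['a', 'b']
```

Because `multimode` only counts occurrences, it works equally well for numerical and non-numerical values.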


Consider the table of data shown below. 


[Table of raw data with the columns OBSERVATIONS and VALUE] 


The value 11 has a frequency of 
The value 21 has a frequency of . 


The value 33 has a frequency of 
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28 


29. 


(Continued) 


The two modes in this collection of data are the values 
and . 11, 21 

It would be possible to find the value that occurred more 

frequently than any other value in either numerical or 


non-numerical data. 


Therefore, a collection of non-numerical data can be 

described in terms of the modal, or most frequently 

occurring value. For example, if you tossed a coin 

5 times and observed 3 "heads" and 2 "tails," you could 

say was the modal (most frequently heads 


heads/tails 


occurring) value in your data. 


Although the mode is often a useful way of characterizing 
the central tendency of a distribution, it is sometimes 
misleading to describe a distribution by its mode. 
Consider the two frequency distribution graphs shown 


below: 
[Two frequency distribution graphs, GRAPH A and GRAPH B: columns on a vertical scale from 0 to 50 for each VALUE from 0 to 5] 
Notice how the mode of Distribution is near the A 


A/B 
center of the distribution and is typical of the values in 
that distribution, whereas the mode of the other 
distribution is not near the center of the distribution and 
is not particularly typical of the values observed in that 


collection of data. 
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31. 


32. 


Partly because the mode of a distribution is not always 
representative of the central or typical value in a 
distribution, statisticians have defined other ways of 
characterizing the central tendency of a distribution. 


Another way of representing central tendency is to report 
a value which is smaller than the same number of 
observations as it is larger than. This value is called 
the median of the distribution. For example, suppose 
your data consisted of the observations 4, 5, 7, 8, and 


10. The value 7 would be greater than of the 
3/2 


remaining observations and smaller than of the 
3/2 


remaining observations. Therefore, we could call 


the median of these five observations. 


The easiest way of finding the median of a collection of 
data is to list all the observed values in the order of their 
size. This procedure is called ranking the data. If your 
data consisted of the values 4, 3, 8, and 7, then the list 


of values 8, 7, , and would be a ranking of 
3/4 3/4 


the data. 
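
Ranking can be sketched in Python as sorting from the largest value to the smallest. This is only an illustration; the function name `rank_descending` is an assumption and is not part of the program.

```python
def rank_descending(values):
    """List the observed values in order of size, largest first."""
    return sorted(values, reverse=True)

print(rank_descending([4, 3, 8, 7]))   # [8, 7, 4, 3]
```

The original list is left unchanged, so the order in which the values were observed is not lost.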
If you were to rank the values 10, 6, 11, and 4, you 


would start with the largest value and end with the 


smallest to form the list , , , and . 
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35. 


36. 


You could rearrange the observations in a table of 

raw data, arranging the values in the order of their size 
rather than in the order in which they were observed, 
Thus, we would be forming this new table by ranking the 


observations. 


Notice that the data shown in Table B 


were/ were not 


formed by ranking the data in Table A. 
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1 
d 2 
3 11 3 
4 8 4 
TABLE A TABLE B 
Table D (shown below) formed by ranking 


was/ was not 




the data in Table C. 


d 
> r 2 
3 11 3 
4 8 4 
TABLE C TABLE D 
We stated earlier that the median value in a collection of 
data was a value which was than half of the 
other observed values and than the other 


remaining values. 


One way of finding the median of a collection of data is to 
rank the data and locate the value which divides this list 
of ranked values in half. If 10, 7, 6, 2, and 1 were a 
list of ranked values, the value would divide the 
list in half. Two of the other values would be larger 


than 6, whereas of the remaining 


(number) 
values would be smaller than 6. 
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were not 


was 


smaller 


larger 
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(Continued) 


Since 6 is larger than half of the remaining values and 
smaller than the other half, the median would be . 


It is a simple matter to find the median of a distribution 
when you have an odd number of observations. You 
simply rank the observations and find the middle value 
in this list of ranked observations. This middle value 
would be the of your data. 


If you had an even number of observations, there would 
not be a value in the list of ranked data such that the 
same number of observations fell above and 


below that value. 


To illustrate this problem consider the table of ranked 


data shown below: 


Notice how we have indicated the rank of each value in 
the first column of the table. The largest observed 
value has a rank of and the smallest observed 


value has a rank of . 
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Notice also that there are observations larger 
than the value 40 and observations smaller than the 


value 40. Thus, 40 the median. There are 


is/is not 


observations larger than the value 30 and _ _ observations 
less than the value 30, indicating that 30 the median. 


is/ is not 
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(Continued) 


Neither 30 nor 40 is the median, since too many values 


are smaller than whereas too many values are 
30/40 


larger than 


30/40 


Strictly speaking, any value between 30 and 40 could be 
called the median of this data. However, statisticians 
have agreed upon a rule for finding the median of an 
even number of observations. They would say that the 
median of the previous collection of data was a value 
halfway between 30 and 40. In other words, 35 

be called the median, because 


would/ would not 


35 is halfway between 30 and 40. 


Suppose 8, 6, 4, and 3 were ranked data. The value 
halfway between 6 and 4 would be the value . 
Therefore, 5 would be the of this data. 
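
The rule for the median can be sketched in Python as follows. This is a minimal illustration, not the program's own procedure; the function name `median_of` is an assumption. The data are ranked, the middle value is taken when the number of observations is odd, and the value halfway between the two middle values is taken when it is even.

```python
def median_of(values):
    """Median by ranking: the middle value, or halfway between the two middle values."""
    if not values:
        raise ValueError("the median of no observations is undefined")
    ranked = sorted(values, reverse=True)       # largest value first
    n = len(ranked)
    middle = n // 2
    if n % 2 == 1:                              # odd number of observations
        return ranked[middle]
    upper, lower = ranked[middle - 1], ranked[middle]
    return lower + (upper - lower) / 2          # add half the difference to the smaller value

print(median_of([4, 5, 7, 8, 10]))   # 7
print(median_of([8, 6, 4, 3]))       # 5.0
```

Python's standard library offers `statistics.median`, which applies the same rule.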


Three lists of ranked data are shown below. 


Data A: 5, 4, 1 

Data B: 6, 3, 2 

Data C: 5, 3, 1, 1 
The middle value in Data A is the value 4; therefore, 4 
would be the of Data A. 


The middle value in Data B is the value . This value, 


therefore, would be the of that data. 
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44. 


45. 


(Continued) 


Notice that Data A and B consist of an number 


odd/ even 
of observations, whereas Data C consists of an 


number of observations. Using the rule for 
odd/ even 


data consisting of an even number of observations, we 
would find the value halfway between and ; this is 
the value . Therefore, we would say the median 
of Data C is . 


Consider the table of data shown below: 


The largest value in this table is and the 


smallest value is . 


If we rank the data in the preceding table, our list of 
values be 200, 160, 180, 100, 30. 


would/ would not 


The previous list was incorrectly ranked. The value 
"180" should have come before rather than after 


it, since 180 is larger than . 


We refer to the largest value in a table of ranked data 

as having rank 1. The value having rank 1 in the previous 
collection of data would be . The value having 
rank 2 would be . 
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odd 


even 


200 
30 


would not 


160 
160 


200 
180 
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48. 


49. 


50. 


51. 


The median of the previous collection of data is the 
value , since the values and are 
larger and the values and are smaller 


than this median value. 


However, if the values in the previous collection of data 
were 200, 180, 160, and 100, the median would be a value 
halfway between and . 


To find the difference between 180 and 160, you subtract 
160 from 180. Half of this difference is , which 
is what you would add to the value 160. Therefore, the 


median is . 


We have now considered two ways of representing the 


central tendency of a distribution: by the and 
by the . 
The of a distribution is the most frequently 


mode/ median 


occurring value in the distribution, whereas the 
is a value that is less than half of the other 


median/ mode 


values and greater than half of the remaining values. 


Suppose your data were the values: 


10, 5, 1, 10 and 4. 
The median of the distribution is the value , 


whereas the mode is the value . 
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53. 


54. 


55. 
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The mode of the collection of data shown above is 


since this value occurs times. 


The value 9 the median of the previous 
is/ is not 


collection of data since there are observations 
with values larger than 9 and observations with 


values smaller than 9. 


While 9 is too large to be the median, 7 is too 


to be the median. 


To find the median of the previous data, you would find 
the value halfway between and . Therefore 
the median of the previous collection of data is . 


The median (like the mode) can sometimes give a 
misleading picture of a distribution. For example, 
consider the two collections of data shown below: 

Data A: 100, 99, 98, 97, 96. 

Data B: 100, 99, 98, 4, 2. 
The median of Data A is and the median of Data B 


is 
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is not 


small 


98 
98 


OSs 




56. 


57. 


While both distributions have the same median, the 


values below the median in Data differ much 
A/B 


more from the median than do the values above it. 
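
What the median leaves out can be made concrete with a short Python sketch using the two collections of data given in this frame. It is only an illustration; the function name `distances_from_median` is an assumption. For each collection it reports the median and how far the other values lie above and below it.

```python
import statistics

def distances_from_median(values):
    """Return the median and the distances of the values above and below it."""
    med = statistics.median(values)
    above = [v - med for v in values if v > med]
    below = [med - v for v in values if v < med]
    return med, above, below

data_a = [100, 99, 98, 97, 96]
data_b = [100, 99, 98, 4, 2]

print(distances_from_median(data_a))   # (98, [2, 1], [1, 2])
print(distances_from_median(data_b))   # (98, [2, 1], [94, 96])
```

The two medians agree even though the distances below the median are very different in the second collection.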


The median only indicates the value dividing the list of 
ranked data into two equal parts. The median does not 
indicate how much smaller or how much larger are the 
values falling above it or below it in the list. This 
can be illustrated by the following raw data graphs. 


[Two graphs of raw data, GRAPH A and GRAPH B: the VALUE of each OBSERVATION] 


Notice that in both collections of data 
observations had a value larger than the value of 
observation 3 and observations had a value 
smaller than the value of observation 3. Thus, 

is the median of both distributions since it is the value 
of observation __ . 
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58. 


59. 


60. 


61. 


The table of data shown below contains 
observations of a variable. 


numerical/ non-numerical 
Observation Value 


The graph of raw data shown below 
does/ does not 


represent the data in the previous table. 


[Graph of raw data: the VALUE of OBSERVATION 1, 2, and 3] 


The previous graph does not represent the previous 
table of data, since observation 1 had a value of 5 and 
observation 2 had a value of 8 in the table, whereas 
observation 1 had a value of and observation 2 
had a value of in the graph. 


The graph of raw data shown below 
does/ does not 


represent the data shown in the previous table. 


[Graph of raw data: the VALUE of OBSERVATION 1, 2, and 3] 
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61. 


62. 


(Continued) 


All the observed values in the previous graph 


are/ are not 


the same. 


Differences in the observed values are indicated on the 


graph by differences in the of the 3 columns. 


Thus, the value of observation 1 is clearly less than 
that of observation 2, since the height of column 1 is 
less than that of column 2. Similarly, the value of 


observation 2 is than the value of 
less/ greater 

observation 3, since column is higher than 

column 


While you can always compare the value of one 
observation to the value of another observation, it 
wouldn't make much sense to simply say, "Observation 
3 is less than." The immediate question would be: 
"Less than what?" It is often convenient to pick a 
reference value for comparison with the observed values. 
For example, suppose you chose the value 4 as a 
reference value. You could then describe the data in 

the previous graph by saying, "Observation 1 is greater 
than 4, observation 2 is greater than 4, and 


observation 3 is than 4." 
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63. 


64. 


The relationship between the reference value 4 and the value 
of each of the three observations is made clear by the 


following graph. 


[Graph of raw data: the VALUE of OBSERVATION 1, 2, and 3, with a dotted line marking REFERENCE VALUE 4] 


While this graph represents the same data as does the 

previous graph, we have indicated the reference value 4 

with a dotted line drawn across the graph at a height 

equal to the value . We have also indicated the 4 
difference between the reference value and each observed 

value with an arrow. The arrow in column 1 is pointing 

upwards, since the observed value 5 is greater than the 

reference value 4. Similarly, the arrow points 


in column 2, since the value of observation 2 up 
up/down 
is than the reference value 4. greater 


greater/ less 


The value of observation 3 is than the less 


less/ greater 
reference value 4. Therefore, the arrow on our graph 


points . Notice that the difference between the down 
up/ down 


value of observation 1 and the reference value is less 
than the difference between the value of observation 2 
and the reference value. This is indicated on the graph 


by the fact that the arrow in column 1 is shorter 


longer/ shorter 


than the arrow in column 2. Thus, the size of the 
difference between the observed value and the reference 
value is indicated by the of the arrow length 


representing that difference. 

65. 


66. 


The length of the arrow represents the difference between 
the reference value and the observed value, regardless 
of the direction in which the arrow is pointing. Thus, the 
arrow points up in column 2 since the second observed 
value is greater than the reference value, whereas the 
arrow points down in column 3 because the third 
observed value is than the reference value. 
However, the value of observation 3 is closer to the 
reference value than is the value of observation 2. This 
is indicated by the fact that the arrow in column 

is than the arrow in column 2. 


shorter/ longer 


Differences between observed values and a reference 
value (represented with arrows in the previous graph) 
are referred to as deviations by statisticians. If a 
value is greater than the reference value, that value has 
a positive deviation from the reference value. If a value 
is less than the reference value, that value has a 
negative deviation from the reference value. Since the 
value of observation 1 (on the previous graph) is greater 
than the reference value 4, you would say that the value 
of observation 1 had a positive deviation from the 
reference value 4. Similarly, since the value of 
observation 2 is than the reference value 4 


greater/ less 


you would say the value of observation 2 had a 
deviation from the reference value 4. 


positive/ negative 


Since the value of observation 3 is less than the 
reference value 4, we would say the value of observation 
3 had a deviation from the 


positive/ negative 


reference value 4. 
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67. 


68. 


69. 


The amount by which a particular value deviates from 
the reference value is indicated by the length of the 
arrow in the previous graph. Whether the deviation is 
positive or negative is indicated by the direction in 
which the arrow points. In other words, whenever an 
observed value is greater than the reference value, the 

arrow will point ________, indicating a ________ 
                 up/ down                positive/ negative 

deviation. Whenever the observed value is less than the 

reference value, the arrow will point ________, indicating 
                                      up/ down 

a ________ deviation. 
  positive/ negative 


The graph of raw data shown below represents 2 
observations having a value of and . 


[Graph of raw data: two columns, one for each observation. Vertical axis: VALUE (1 to 6); horizontal axis: OBSERVATION (1, 2).]


Suppose you chose 0 as the reference value for 
describing the two observations in the previous graph. 


Both of the observed values would be than 


greater/ less 


the reference value and would represent ________ 
positive/ negative 


deviations from the reference value. 
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71. 


72. 


Suppose you chose 8 as a reference value. Since the 
value of both observations is ________ than 8, both    less 
                              greater/ less 

deviations would be ________.    negative 
                    negative/ positive 


Suppose you chose a reference value somewhere between 
6 and 2. The value of observation 1 would represent a 


deviation from the reference value, positive 
positive/ negative 
whereas the value of observation 2 would represent a 

deviation. negative 


positive/ negative 


We have redrawn the preceding graph and indicated below 
the deviation of each observation from a reference value 


of . 


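[Graph: the two observations redrawn as columns, with a line marking the REFERENCE VALUE and an arrow in each column showing its deviation. Vertical axis: VALUE; horizontal axis: OBSERVATION (1, 2).]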


The reference value is closer to the value of observation 
2 than it is to the value of observation 1. This is 
indicated by the fact that the arrow pointing up is 


________ than the arrow pointing down. We can    longer 
longer/ shorter 
calculate the actual size of the deviation simply by 
subtracting the reference value from the observed value. 
For example, you would calculate the deviation of 
observation 1 from the reference value by subtracting 3 


from . 
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73. 


74. 


75. 


76. 


77. 


The deviation of observation 1 from the reference value 3 
equals 3 subtracted from 6, which equals . 3 


Similarly, you would calculate the deviation of 
observation 2 from the reference value 3 by subtracting 
3 from ____.    2


Notice that 2 minus 3 equals -1. This means the deviation 
of observation 2 from the reference value 3 is ____.    -1


-1/+1 
When we subtract a reference value from a smaller value, 


our answer will be negative. Therefore, all negative 


deviations will be represented by negative 


positive/ negative 


numbers. On the other hand, all positive 


deviations will be represented by positive numbers, since 
the observed value is, in this case, greater than the 


reference value. 
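
The subtraction rule described in the frames above can be sketched in a few lines of Python. This is only an illustration: the function name `deviations` is introduced here for convenience and is not part of the program.

```python
def deviations(values, reference):
    """Return each observed value minus the reference value.

    A positive result means the value lies above the reference value;
    a negative result means it lies below.
    """
    return [value - reference for value in values]


# The two observations from the graph (6 and 2) and the reference value 3.
print(deviations([6, 2], 3))  # [3, -1]
```

The sign of each result comes out of the subtraction itself, so no separate rule is needed to decide whether a deviation is positive or negative.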


In the table shown below, we have summarized the 
information concerning deviations of the observed values 


from the reference value 3. 


OBSERVATION    VALUE    DEVIATION FROM 3 
     1           6            +3 
     2           2            -1 


We have simply added another column to the previous 
table of raw data and listed the deviations of each of the 


observed values from the value ____. Thus, the    3 
numeral -1 in the last row of the third column represents 
the deviation of observation 2 from the reference value 3. 
We obtained the deviation of observation 2 from the 
reference value 3 by subtracting ____ from ____, to    3, 2, 


give us an answer of ____. 
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79. 


80. 


81. 


82. 


83. 


The following table contains the same two observations. 


However, we have left room to indicate deviations from 


the reference value ____ rather than 3 (as in the 


previous table). 


OBSERVATION    VALUE    DEVIATION FROM 4 


To find the deviation of observation 1 from the 


reference value 4, you subtract 4 from ____. 


6/2 
Therefore, the deviation of observation 1 from the 
reference value 4 is ____. 
The following table ____ correct. 
is/is not 


OBSERVATION    VALUE    DEVIATION FROM 4 


The previous table is incorrect because observation 2 


was smaller than the reference value. Therefore, the 


deviation of observation 2 from the reference value 


would have to be represented by rather than 
2. 
The following table ____ correct. 

is/ is not 


OBSERVATION    VALUE    DEVIATION FROM 4 


1 
2 
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84. 


85. 


86. 


87. 


88. 


There is something special about using 4 as the 

reference value. The positive deviation represents 

the same difference between the observed value and the 

reference value as does the negative deviation. In other 

words, if we represented deviations on a graph as we 

did previously, the length of the two arrows would be 
(although one arrow would be 


the same/ different 


pointing up and the other down). 


Because the reference value 4 has the unique property of 
being as close to the value 6 as it is to the value 2, we 
say that 4 is the mean of the values 6 and 2. Since 6 and 
2 are the two observed values in the previous collection 
of data, we could say that the mean of that collection of 
data is ____. 


If we added the positive deviation from 4 and the negative 
deviation from 4, our answer would be 0 because 
+2 added to -2 equals ____. 

Another way, therefore, of describing that unique 
characteristic of the reference value 4, which makes it 
the mean of the previous collection of data, is to say the 
sum of the deviations from 4 equals ____. 


Suppose your data consisted of 3 observations instead of 
only 2 (for example, 8, 3, and 1). If you chose 10 as 
your reference value, all the deviations would be 


. If you chose 0 as your reference 
positive/ negative 


value, however, all the deviations would be 


positive/ negative 


OBSERVATION    VALUE    DEVIATIONS FROM 6 
     1           8            +2 
     2           3            -3 
     3           1            -5 
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the same 


negative 


positive 
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93. 


To check the deviation of observation 3 in the previous 
table, you would subtract the value from the 
value 8. This would indicate that the deviation in 
the table was ____. 


correct/ incorrect 


We mentioned that a reference value is called a mean 
when the sum of the deviations from that reference value 
equals 0. For the set of data we considered earlier, for 
example, the deviations from the mean value 4 were +2 
and -2, giving a sum of deviations equal to (+2) + (-2), 


which equals . 


Even when the data consist of more than two observations, 
we can define the mean in the same way. In other words, 
if we add all the deviations from a particular reference 
value and our answer is 0, that reference value is the 
mean of those observations. Consider the previous table 
of deviations from the reference value 6. To find the 
sum of the deviations, we would add +2, -3, and -5, 


which yields an answer of ____. 


Since the sum of the deviations of our three observed 
values from the reference value 6 does not equal 0, the 
value 6 ____ the mean of these three observations. 


is/is not 


We could record the deviations of each observation from 
the reference value in the following table: 


DEVIATIONS 
OBSERVATION VALUE FROM 4 
1 8 
2 3 
3 1 


We would find the deviation of observation 1 by subtracting 


____ from ____, indicating that the deviation of observation 1 


from the reference value was 
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correct 


is not 
In the same way, we would find that the deviation of 
observation 2 was ____ and that the deviation of 


observation 3 was . 


We have recorded the deviations from the reference 


value 4 in the following table: 


OBSERVATION    VALUE    DEVIATIONS FROM 4 
     1           8            +4 
     2           3            -1 
     3           1            -3 


We said that the mean or average of a group of numerical 
observations is that particular reference value yielding 
deviations whose total is 0. To find the total of the 
deviations from the reference value 4 for the three 
observations in this table, we would add ____, ____, and ____. 
Thus, 4 ____ the mean of this group of observations. 
        is/is not 
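
The test described above (add the deviations and see whether the total is 0) can be sketched in Python. The function name `sum_of_deviations` is an assumption made for this illustration only.

```python
def sum_of_deviations(values, reference):
    """Add up the deviations of every value from the reference value."""
    return sum(value - reference for value in values)


observations = [8, 3, 1]

# Deviations from 6 are +2, -3, and -5, so 6 is not the mean.
print(sum_of_deviations(observations, 6))  # -6

# Deviations from 4 are +4, -1, and -3, which cancel exactly.
print(sum_of_deviations(observations, 4))  # 0
```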


We can illustrate this graphically as follows: 


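[Graph: the three observations drawn as columns against the reference value 4, with one upward arrow labeled POSITIVE DEVIATION and two connected downward arrows labeled NEGATIVE DEVIATION. Vertical axis: VALUE (1 to 8); horizontal axis: OBSERVATION.]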


The arrow labeled "positive deviation" represents the one 
positive deviation in the graph. The two downward 
pointing arrows, which are connected together, represent 
the two negative deviations in the graph. In order for the 
sum of the deviations to equal 0, the positive deviations 
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96. 


97. 


98. 


(Continued) 


must exactly balance the negative deviations. In other 

words, the total length of the arrows representing the 

positive deviations must be exactly the as the same 
total length of the arrows representing negative 

deviations; this is apparently true when the reference 

value is 4. Thus, ____ is the mean of these three    4 


observations. 


Consider the following two graphs. 


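[Two graphs of raw data, GRAPH A and GRAPH B, each showing three observations as columns with a reference value and an arrow for each deviation. Vertical axis: VALUE; horizontal axis: OBSERVATION (1, 2, 3).]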


Remember, the mean is that reference value from which 
the sum of the deviations equals 0. Keeping this in mind, 


it would appear that the reference value shown in Graph B 
A/B 


was more likely the mean than is the reference value 


in the other graph. 


The reference value in Graph B is more likely the mean 

because the two positive deviations added together almost 

equal the negative deviation in size, whereas the two 

positive deviations in Graph A added together appear to 

be than the size of the negative larger 


larger/ smaller 


deviation in that graph. 
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99. The mean is a useful way of representing the central 
tendency of a distribution. However, we do not yet 


have a simple way of calculating the mean from an 

actual collection of data. Of course, you could try one 

reference value after another, calculating the deviations 

from each of these reference values. You could 

eventually locate a reference value from which the sum 

of the deviations equaled , which would zero 
indicate that reference value was the mean of the 


collection of data. 
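
The trial-and-error search described in this frame might look like the following Python sketch. The function name `find_mean_by_trial` and the list of candidate reference values are assumptions chosen for illustration; the search succeeds only if the mean happens to be among the candidates tried.

```python
def find_mean_by_trial(values, candidates):
    """Try each candidate reference value in turn.

    Return the first candidate whose deviations sum to zero,
    or None if no candidate in the list is the mean.
    """
    for reference in candidates:
        if sum(value - reference for value in values) == 0:
            return reference
    return None


# Try the whole numbers 0 through 10 for the observations 8, 3, and 1.
print(find_mean_by_trial([8, 3, 1], range(0, 11)))  # 4
```

Even for three observations this takes several attempts, which suggests why a direct formula is preferable for larger collections of data.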


If the collection of data consisted of only a few 

observations, you could probably find the mean in this 

manner. However, if the data consisted of many 

observations, the procedure outlined in the previous 

frame would not be practical. Therefore, we will 

consider a procedure whereby we can calculate the 

mean of any collection of data according to a simple 

rule. A rule for calculating the value of some statistic 

is called the formula for that statistic. Thus, a rule 

for calculating the mean would be called a formula 


for the mean. 


100. In order to discuss ways of calculating specific 
statistics, it is often useful to talk about data in general 
rather than about particular observed values. For 
example, consider the two tables of raw data shown 
below. 

TABLE A          TABLE B 


1 


2 à 
3 3 
4 4 


Each table lists the values for observations of a 4 
476 

numerical variable. (You might wish to insert a book- 

mark here since we will refer to these tables in later frames. ) 
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The term "observation 1" specifies a particular value 
in each table. It specifies the value 20 in Table A and 


the value in Table B. 


Similarly, the term observation 3 refers to the 


value in Table A and the value in Table B. 


Instead of writing out observation 1 or observation 2, 
statisticians have found it simpler to use the symbol $X_1$ 
to represent the same value as does the term 
observation 1. Thus, $X_1$ represents the value 20 in 
the previous Table A and the value ____ in Table B. 


The number which appears after and just below 
the capital letter X in $X_1$ is called a subscript. The 
subscript indicates which particular observation you are 
representing. Thus, the subscript ____ in $X_1$ 
indicates you are representing observation 1. 


Notice that the symbol $X_2$ has a s________ 2 instead 
of a subscript 1. 
Similarly ____ has a subscript 3 instead of a 


$X_2$/$X_3$ 


subscript 2. 


Since the symbol $X_1$ indicates "observation 1," you 


would use the symbol x, to represent " observation . 


Notice that the symbol x, can represent the 


HI 


first/ second 


observed value in any table of raw data, regardless of the 


particular value of that observation. 
In the same way, you could represent the third observation 
in any table of raw data by ____. The subscript ____    $X_3$, 3 
indicates you are referring to observation 3. 


Considering the two previous tables of data, the term $X_1$ 
would refer to the value 20 in Table A and the value 10 
in Table B. 

$X_3$ would refer to the value ____ in Table A and the    42 
value ____ in Table B.    0 


Notice how you could represent any collection of three 
observations with $X_1$, $X_2$, and $X_3$. 


If your data consisted of $X_1$, $X_2$, $X_3$, and $X_4$, then there 
would be ____ observations in your collection of data.    four 


Statisticians often use a capital letter N to represent the 
number of observations in a collection of data. If the 
collection of data consisted of four observations, N 
would equal 4. If the data consisted of ten observations, 
N would equal ____.    10 


Since N represents the ____ of observations in a    number 
table of raw data, you could represent any collection of 
N observations with a column of X's starting with $X_1$, 
then $X_2$, and so on, until $X_N$. 


This way of representing tables of data is useful because 
you can describe a general rule or formula for calculating 
some statistic without speaking of particular values. For 
example, you could represent the procedure for getting 
the sum or total of a collection of five observations with 


the formula: 


Total = $X_1 + X_2$ + ____ + ____ + ____    $X_3$, $X_4$, $X_5$ 
You could write a general formula for finding the total 
of any collection of data consisting of three observations 
as follows: 

$$\text{Total} = X_1 + X_2 + X_3$$ 


It would become awkward to write a formula of this sort 

if the collection of data consisted of very many observations. 

For example, if the data consisted of one hundred 

observations, we would have to write out a string of X 

values with plus signs between them, beginning with $X_1$ 

and ending with an X having the subscript ____.    100 


One way you could simplify the formula would 
be to write out the phrase: sum all the X' s, which would 
mean together the values of all of the add 


add/ multiply 


observations in the table of data. 


Statisticians have a special shorthand way of writing the 
phrase: sum all the X's. Instead of writing out the 
whole phrase, they simply write $\sum X$. Thus, if your 
data consisted of two observations, $\sum X$ would equal $X_1$ 
plus $X_2$. If your data consisted of three observations, 


$\sum X$ would equal ____ + ____ + ____.    $X_1$, $X_2$, $X_3$ 


You are now in a position to write a simple 

formula for the total of all the observations in any 
collection of data. Using the shorthand way of writing 
"sum all the X's," you could write the formula for the 
total as 

Total = ____    $\sum X$ 
The symbol $\Sigma$ is a capital Greek letter called sigma. 
Thus, the Greek letter ____ in the expression $\sum X$    sigma 
indicates to you that the expression represents the ____    sum 
of all the X's (observations). 


The Greek sigma ($\Sigma$) is often referred to as a 
summation symbol, since $\sum X$ represents the sum 
of the observed values. 
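
For readers who know a programming language, the summation symbol corresponds closely to a built-in "add everything up" operation. The sketch below uses Python's built-in `sum`; the wrapper name `sigma` and the sample values are assumptions made for illustration.

```python
def sigma(observations):
    """Sum all the X's: X1 + X2 + ... + XN."""
    return sum(observations)


# Three hypothetical observations X1, X2, X3.
x = [2, 8, 2]
print(sigma(x))  # 12

# The number of observations, N, is simply the length of the list.
print(len(x))  # 3
```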


Let's return now to the original task of finding a formula 
for the mean. Basically, this involves stating the 
formal relationship between the mean of the data and the 
observed values by a mathematical equation, and then 

(by simple algebra) rearranging this equation until it is 


in the form of a formula for the mean. 


We shall begin by considering a collection of data 
consisting of three observations. In other words, 
N = ____ for this collection of data.    three 


Just as we can represent observation 1 by $X_1$ without 
indicating any particular value, we will represent the 
mean by $\bar{X}$. In this way, we can represent the mean 
without indicating any particular ____ for the mean.    value 


You could represent the deviation of observation 1 from 
the mean as $(X_1 - \bar{X})$. In a similar way, you would 
represent the deviation of observation 2 from the mean 
as (____ $- \bar{X}$).    $X_2$ 


If the collection of data consisted of three observations, 
you could represent the sum (total) of the three deviations 


from the mean by $(X_1 - \bar{X}) + (X_2 - \bar{X}) +$ (____ $- \bar{X}$).    $X_3$ 


If your three observations had the values 10, 2, and 4, 

you could put these actual values in the previous formula 

and write it as $(10 - \bar{X}) +$ (____ $- \bar{X}$) $+$ (____ $- \bar{X}$).    2, 4 
According to our previous discussion of the mean, you 
know the sum of the deviations from the mean must equal 
____. Therefore, the following formula ____ 
                                       could/ could not 
be true if $\bar{X}$ is the mean of these three observations: 

$$(10 - \bar{X}) + (2 - \bar{X}) + (4 - \bar{X}) = 5$$ 


If your data consisted of three observations and the 
following equation were true, 10 would be the 


of these three observations . 


$$(X_1 - 10) + (X_2 - 10) + (X_3 - 10) = 0$$ 


In other words, without actually stating the values of the 
observations or the actual value of the mean, we know 
(from the definition of the mean) that the following 
equation is ____ for any collection of three 
observations. 

$$(X_1 - \bar{X}) + (X_2 - \bar{X}) + (X_3 - \bar{X}) = 0$$ 


According to simple algebra, we remove 


could/ could not 


the parentheses in the previous equation and write it as 


follows: 

$$X_1 - \bar{X} + X_2 - \bar{X} + X_3 - \bar{X} = 0$$ 

Furthermore, we could rearrange the symbols on the 
left-hand side of the previous equation so that the 
equation reads as follows: 

$$X_1 + X_2 + X_3 - \bar{X} - \bar{X} - \bar{X} = 0$$ 
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130. Remember, the left-hand side of the previous equation 
is the sum of the deviations from the mean. Notice how 
this sum is made up of the total of all the scores from 


which we subtract the mean times. three 


three/four 


131. We can represent the total of the three observations by 
$\sum X$. Also, subtracting the mean three times is the 
same as subtracting $3\bar{X}$. Therefore, we could write the 
previous equation as $\sum X -$ ____ $\bar{X} = 0$.    3 


132. The previous equation says that when we subtract $3\bar{X}$ 
from the sum of the three observations, our answer is 
zero. Therefore, the total of the three observations 
($\sum X$) must be exactly equal to ____.    $3\bar{X}$ 
133. We started with an equation that said the sum of the 
deviations from the mean equals zero, and we have 
proceeded to the following equation: 
$$\sum X = 3\bar{X}$$ 
If we divide both sides of the previous equation by 3, 
the result would be 
$$\frac{\sum X}{3} = \frac{3\bar{X}}{3}$$ 
The 3's on the right-hand side of the equation cancel 
each other out, leaving us with the equation: 
$$\frac{\sum X}{3} = \bar{X}$$ 
This equation is a formula for the mean, since it says 


that if we added together our three values and divided 
this sum by the number of observations (3), the result 


would equal the mean. 


Let's see if this formula works for a particular example. 
Suppose your data consisted of the values 2, 8, and 2. 
The formula says to first add the observations. This 
would give you a sum of ____. Dividing this sum by 3    12 
would give you a result of ____.    4 


In other words, 

$$\bar{X} = \frac{\sum X}{3} = \frac{2 + 8 + 2}{3} = \frac{12}{3} = 4$$ 

Therefore, according to the formula, the mean of the 
three observations is ____. We can check to see whether    4 
4 really is the mean of the previous observations by 

considering the following table: 


OBSERVATION    VALUE    DEVIATION FROM 4 
     1           2            -2 
     2           8            +4 
     3           2            -2 


The sum of the deviations from the reference value 4 


equals ____. Therefore, we know that 4 is the    zero, 


____ of the 3 observations. 
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136. The formula we derived was for a collection of three 
observations. Suppose the data consisted of 4 
observations. We could write the following equation, 
which specifies the relationship between the mean and 
those 4 observations: 
$$(X_1 - \bar{X}) + (X_2 - \bar{X}) + (X_3 - \bar{X}) + (X_4 - \bar{X}) = 0$$ 


137. We could remove the parentheses in the previous 
equation and rearrange the terms so that the equation 
reads as follows: 

$$X_1 + X_2 + X_3 + X_4 - \bar{X} - \bar{X} - \bar{X} - \bar{X} = 0$$ 


138. Using our shorthand way of writing this, we could say 
that: 

$$\sum X - 4\bar{X} = 0$$ 


139. Remember, the sum of the deviations from the mean 
equals $\sum X - 3\bar{X}$ when the data consists of 3 observations, 
and $\sum X - 4\bar{X}$ when the data consists of 4 observations. 
No matter how many observations there are in the 
collection of data, it is not hard to see that the following 


equation ____ be true,    would 
would/ would not 
$$\sum X - N\bar{X} = 0$$ 


where N is the number of observations. 
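
The step from three or four observations to any number N can be written out as a short derivation. The index $i$ and the limits on the summation sign are notation introduced here for illustration; the program itself writes the sum simply as $\sum X$.

```latex
\begin{align*}
\sum_{i=1}^{N} (X_i - \bar{X})
  &= (X_1 - \bar{X}) + (X_2 - \bar{X}) + \dots + (X_N - \bar{X}) \\
  &= (X_1 + X_2 + \dots + X_N) - N\bar{X} \\
  &= \sum X - N\bar{X} = 0
\end{align*}
```

Each of the N deviations contributes one $-\bar{X}$, which is why the mean is subtracted exactly N times.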


140. Just as we did earlier, we could rearrange this equation 
to read: $\sum X =$ ____ $\bar{X}$. Then, dividing both sides of the    N 
equation by N, we would finally arrive at the following 
formula for the mean: 

$$\bar{X} = \frac{\sum X}{N}$$ 
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144. 


(Continued) 


This formula says the mean of a collection of N 
observations is equal to the total of all of the observations 
divided by the of those observations. 


Thus, if you had a collection of 10 observations and the 
total of all the observations were 50, the mean of the 
observations would equal ____ divided by ____. In 


other words, the mean would equal ____. 
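
The formula for the mean translates directly into code. The following Python sketch is an illustration under stated assumptions: the function name `mean` and the error raised for an empty collection are choices made here, not something the program specifies.

```python
def mean(values):
    """Total of all the observations divided by the number of observations."""
    if not values:
        raise ValueError("the mean of no observations is undefined")
    return sum(values) / len(values)


# The earlier example: the values 2, 8, and 2 have a total of 12 and N = 3.
print(mean([2, 8, 2]))  # 4.0

# Ten observations with a total of 50: the mean is 50 divided by 10.
total, n = 50, 10
print(total / n)  # 5.0
```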


Let's test this formula on the following collection of 


5 observations. 


OBSERVATION    VALUE    DEVIATIONS FROM 6 


Notice that $\sum X$ = ____ for this data (because $\sum X$ is 
a shorthand way of writing "sum of all the values"). 


$\sum X$ equals 30, and N equals ____; thus, our formula 
for the mean says to divide ____ by ____, which 


would give a result of ____. 


The formula says 6 is the mean of these five observations. 
In the third column of the table, we have listed the 
deviations of each observation from the reference value 

It is clear that the sum of these deviations equals 


This indicates that 6 the mean of 


is/ is not 


these 5 observations. 


152 


number 


50, 10 


30 


30, 5 
145. We have considered three ways of representing the 


central tendency of a distribution: the m________, 


the m________, and the m________.    median, mean, 


the mode 


146. Often a distribution will have a different mean, median, 
and mode. The data shown in the previous table, 


however, has a mode of ____, a median of ____, and    6, 6 
a mean of ____. 
147. Six is the of the previous collection of data mode 


since this is the most frequently occurring value. Six 

is also the of the previous collection of median 
data since there is, in the collection of data, one value 

greater than six and one value less than six. Finally, 6 

is the of the previous collection of data mean 


because the sum of the deviations from 6 equals zero. 


148. The mean is a useful way of representing the central 
tendency of a distribution because its value depends on 
every value in the distribution. The value of the 
is determined only on the basis of mode 


mean/ median/ mode 


the most frequently occurring value in the collection of 
data. Finally, while you know that there are the same 


number of values above the as median 
mean/ median/ mode 
there are below it, you know how far above do not 


do/ do not 


or how far below the median these values are. Thus, 
each of the various ways of representing the central 
tendency of a distribution has its own peculiar features. 
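
The three measures can be compared side by side with Python's standard `statistics` module. The data below are hypothetical: they were chosen only to match the description in the frames above (five observations totaling 30, with 6 occurring most often, one value above 6, and one value below 6) and are not taken from the program's table.

```python
import statistics

# Hypothetical data matching the description above, not the original table.
data = [4, 6, 6, 6, 8]

print(statistics.mean(data))    # depends on every value in the data
print(statistics.median(data))  # the middle value when the data are ordered
print(statistics.mode(data))    # the most frequently occurring value

# Changing one extreme value moves the mean but leaves the median and mode alone.
changed = [4, 6, 6, 6, 18]
print(statistics.mean(changed), statistics.median(changed), statistics.mode(changed))
```

The second collection illustrates the point of the frame: only the mean responds to how far the values lie from the center.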
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Section IV: Variability 


We have just seen that the mean, median, and mode are 
three that characterize the 
of a distribution. 


Besides their central tendency, there is another 
important characteristic of distributions, one that is not 
represented by either the mean, the median or the mode. 
Some collections of data are composed of many similar 
values, while in other distributions the values might vary 
considerably. For example, the values in Data 

A/B 
(below) are all very similar to each other, while the 


values in Data are much more widely separated. 
B 


Data A: 20, 21, 20, 19, 20 
Data B: 2, 38, 20, 5, 35 


We have listed the two previous collections of data in 


the following tables: 


Observation    Value    Deviation From 20 

     1           20            0 
     2           21           +1 
     3           20            0 
     4           19           -1 
     5           20            0 

TABLE A 


Observation    Value    Deviation From 20 

     1            2          -18 
     2           38          +18 
     3           20            0 
     4            5          -15 
     5           35          +15 

TABLE B 


Notice that the tables indicate the deviation of each value 
from the reference value 


The sum of the deviations in Table A is 
Therefore, the mean of Data A is : 


The sum of the deviations in Table B is ; therefore, 
the mean of Data B and the mean of Data A are 


the same/ different 


While the means of both distributions are identical, 

there is an interesting difference in the two distributions. 
If we ignore whether or not a deviation is positive or 
negative and only consider its absolute size, it is clear 


that the deviations in Table tend to be larger 
A/B 


than the deviations in the other table. In other words, 


the values in Table are more spread out 
A/ B 


(dispersed) around the mean value of 20. 
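
The comparison of absolute sizes can be sketched in Python using the two collections listed above. The function name `absolute_deviations` is an assumption introduced for this illustration.

```python
def absolute_deviations(values, reference):
    """Size of each deviation from the reference value, ignoring its sign."""
    return [abs(value - reference) for value in values]


data_a = [20, 21, 20, 19, 20]
data_b = [2, 38, 20, 5, 35]

# Both collections have a mean of 20, but the sizes of the deviations differ.
print(absolute_deviations(data_a, 20))  # [0, 1, 0, 1, 0]
print(absolute_deviations(data_b, 20))  # [18, 18, 0, 15, 15]
```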
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The manner in which the deviations from the mean are 
different in the two previous collections of data can be 


illustrated by the following graphs of raw data. 
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GRAPH A GRAPH B 


We have represented deviations from the mean with 
arrows, as we have done previously. If we ignore the 
direction in which an arrow is pointing and only consider 
its length, it is clear that the deviations from the mean 


in Data ____ are larger than those of the other 
A/B 


collection. 


Notice how the heights of the columns in Graph 
A/B 


are very similar, whereas the heights of the columns in 
the other graph tend to change or vary much more from 


one observation to another. 


Since the height of a column represents the value of that 


observation, we could say that the values on Graph 
A/B 


change or vary more than do those on the other graph. 
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10. 


11. 


12. 


The more the values in a collection of data vary, the 
more variability that collection of data is said to have. 


Thus, we would say that the variability of Data 
A/B 


was greater than the variability of the other collection 
of data. 


Data composed of many widely different values is said 
to have a great deal of variability. On the other hand, a 
collection of data in which the values are very similar 
or close together could be described as having little 
variability. Therefore, of the following two collections 


of data, Data ____ would be described as having more 


A/B 
variability than the other collection. 


Data A: 10, 9, 10, 10, 11 


Data B: 2, 18, 10, 1, 19 


We could illustrate the difference in variability of the 
two previous distributions with the following graphs: 


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GRAPH A GRAPH B 


Earlier we defined a variable as something that changed 
or varied. The more something changes or varies, the 
more variable it is said to be. Thus, the observed 


values are more variable in Graph than in the 
A/B 


other graph. 
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13. 


14. 


15. 


16. 


The more variable the values in a collection of data, 
the more variability that collection is said to have. 
In other words, the more the values in a collection of 


data change or vary from observation to observation the 


more that collection is said to have. 


If all the values in a collection of data were very similar, 

the data would not have much variability. If all the values 
are very similar, you would expect the difference between 
the largest value and the smallest value to be 


In other words, if the difference between the largest and 
smallest value in a collection of data were very small, 
the data have much variability. 


would/ would not 


We often use the difference between the largest and 
smallest value in a collection of data as a statistic 
representing the variability of that data. We call this 
statistic the range. In other words, you could represent 
the variability of a collection of data by finding the 
difference between the largest and the smallest value in 
that collection of data. This difference is a statistic
called the ______.


If your data consisted of the values 100, 50, 10, 75, and
80, then the ______ of these values would be 90,
since this value is the difference between the highest
value (100) and the lowest observed value (10).
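

The arithmetic in this frame can be checked with a short Python sketch. The function name `value_range` is only an illustrative choice and does not come from the program itself.

```python
def value_range(values):
    """Return the range: the largest observed value minus the smallest."""
    return max(values) - min(values)


data = [100, 50, 10, 75, 80]
print(value_range(data))  # 100 - 10 = 90
```

The other observed values (50, 75, and 80) play no part in the result; only the highest and lowest values are used.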
variability
17.


18. 


The relation between the range and what we have referred 
to as the variability of a distribution can be illustrated 
on the following frequency distribution graphs: 
The data shown on both of these graphs have the same
mean, but the data on Graph B is more variable than the
data on the other graph.


Notice that the difference between the highest and lowest
observed values on Graph A equals ______ minus ______.


The difference between the highest and lowest value on
Graph B equals ______ minus ______.


Since the range of a collection of data is the difference
between the highest and lowest values, the range of the
data on Graph A is ______ and on Graph B, ______.
19.


20. 


21. 


22. 


Notice that the distribution in which the values are spread
farthest from the mean has the largest range. In other
words, the distribution which is most v______
would tend to have the largest range.


Suppose the largest value in a collection of data were 100
and the smallest value were 10. The range of that
collection of data would be ______, since this equals
______ minus 10.


Instead of considering frequency distribution graphs, let's
consider how a graph of raw data indicates the range.
The range of a collection of data is immediately obvious
in a graph of raw data, since the range is the ______
in height of the column representing the smallest
observed value and the height of the column representing
the largest observed value.


The variability of a collection of data does not depend on 
the size of the values in the collection. It only depends
on the differences in the sizes of the values. For example,
consider the two graphs of raw data shown below. (Do
not confuse these graphs with frequency graphs.)


Notice that the values on Graph ______ (A/B) are all larger than
those on the other graph. However, the variability of
the data on Graph ______ (A/B) is greater than the variability
of the other collection of data.
23.


24. 


25. 


The range of the data in Graph ______ (A/B) would be found
by subtracting the value of observation 3 from the value
of observation 4.


The range of a collection of data can also be determined 
from a frequency table by finding the difference between 
the smallest value having a frequency greater than 0 and 
the largest value having a frequency greater than 0. In
the following frequency table, for example, the largest
value having a frequency greater than 0 is ______ and the
smallest value having a frequency greater than 0 is ______.
Therefore, the range of this distribution is ______. The
value 13 was a possible value which ______ (was/was not)
observed. This is why we only consider values with
frequencies greater than zero.


[Frequency table listing the values 10, 11, 12, and 13]
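

The rule in this frame, which uses only values whose frequency is greater than 0, can be sketched in Python. The frequencies below are hypothetical, since the table's frequency column is not reproduced here, and the function name is an assumption made for illustration.

```python
def range_from_frequency_table(table):
    """Range of the observed values in a {value: frequency} table."""
    # Keep only values that were actually observed (frequency > 0).
    observed = [value for value, frequency in table.items() if frequency > 0]
    return max(observed) - min(observed)


# Hypothetical frequencies: 13 is a possible value that was never observed.
frequency_table = {10: 2, 11: 4, 12: 1, 13: 0}
print(range_from_frequency_table(frequency_table))  # 12 - 10 = 2
```

Had the unobserved value 13 been included, the result would have been 3, which overstates the spread of the values that actually occurred.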


The range of the distribution shown in the following
frequency distribution graph is ______, since the smallest
observed value is ______ and the largest observed value
is ______.


[Frequency distribution graph: FREQUENCY against VALUE]
26. The range can sometimes give a misleading picture
of the variability of a distribution. For example, 


consider the following two frequency distribution graphs. 


[Frequency distribution graphs: GRAPH A and GRAPH B]


Notice that the range of the distribution in Graph A is ______
and the range of the distribution in Graph B is ______.


Notice that all the values except one are identical in
Graph ______ (A/B), whereas the values in the other graph were
all different. Therefore, even though both distributions
have the same range, the distribution in Graph ______ (A/B)
appears to be more variable than does the other
distribution.

The problem in representing the variability of a
distribution with the range is that it is determined solely
by the largest and smallest observed values.
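

A small Python sketch makes the same point. The two lists below are hypothetical data invented for illustration, not the values plotted in Graphs A and B.

```python
# Hypothetical data: both collections share the same extremes (1 and 9).
clustered = [1, 5, 5, 5, 5, 5, 9]   # all but the two extremes are identical
spread_out = [1, 2, 4, 5, 6, 8, 9]  # every value is different

range_clustered = max(clustered) - min(clustered)
range_spread_out = max(spread_out) - min(spread_out)
print(range_clustered, range_spread_out)  # 8 8
```

Because only the largest and smallest values enter the calculation, the two ranges agree even though the second collection looks far more variable.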
27.


28. 


29. 


Earlier, we pointed out how the variability of a 
collection of data is related to the deviations of the 
values from their mean. For example, consider the two 
graphs of raw data shown below: 
We have shown deviations from the mean value on both
of these graphs. Thus, the mean of both distributions
is ______ (the same/different), since it equals 10 on Graph A and
10 on Graph B.


While we know that the sum of the deviations from 10
equals ______ in both distributions, there is, nevertheless,
an important difference between the two distributions.
It is apparent that the deviations from 10 tend to be
larger for the data shown on Graph ______ (A/B) than for the
data in the other graph.


You could describe the differences between the two
distributions by saying that the value of each observation
tended to vary or change more on Graph ______ (A/B) than on
the other graph.




the same 




30. 


31. 


32. 


Data containing many widely separated values of a 
variable is said to have more variability than data 
containing very similar values. Thus, of the two
previous distributions, the distribution in Graph ______ (A/B)
could be described as having the most variability.


In other words, we could say that the two distributions
shown in Frame 27 are similar in terms of their
______ (central tendencies/variability) and different in terms of
their ______ (central tendencies/variability).


The relationship between the variability of a collection 
of data and the size of the deviations from the mean is 
illustrated by the following two tables. 


TABLE A

Observation   Value
1             15
2             8
3             12
4             5


TABLE B

Values (partial): 10, 11, 10


Notice that the deviation of observation 4 from the mean
in Table A was obtained by subtracting ______ from ______
to yield an answer of ______.


central tendencies


variability
33.


34. 


35. 


36. 


37. 


Notice that the sum of the deviations in Table A equals
______ and the sum of the deviations in Table B equals ______.
This indicates that the ______ of both
distributions is 10.

0
0, mean


It is clear that the values tend to be farther away from
the mean in Table ______ (A/B) than they do in the other table.

A


Without considering the deviations from the mean, it is
apparent that the value of the observations changed or
varied more in Table ______ (A/B) than they did in the other
table. Therefore, the ______ of the data
in Table A appears to be greater than the variability of
the data in the other table.

A
variability


We could represent the difference in variability by
calculating the range of each collection of data. The
range of the data in Table A would be ______, since this
equals ______ minus ______.
Since the range of the data in Table B is ______, the
range of the data in Table A is ______ (greater/smaller) than the
range of the data in Table B.

10
15, 5
2
greater


The range is not the only statistic you can use to represent
the variability of a distribution. There is another way
of characterizing the difference in variability of the
two previous collections of data. Notice that if we ignore
whether a deviation is positive or negative, the
deviations in Table ______ (A/B) tend to be larger than those in
the other table.

A


You can think of the variability of a collection of data
as the degree to which the values are spread out from the
mean. In other words, if all of the values in a collection
of data are very similar, they will all be clustered very
close to the mean and the data will not have much
variability. If the values in a collection of data are
widely dispersed (spread out), the deviations from the
mean will tend to be very ______ (large/small) and the
distribution could be described as having a great deal
of ______.


We have seen several illustrations in which two collections 
of data have the same mean but different variability. It 

is also possible for two collections of data to have 
different means and the same variability. For example, 
consider the two graphs shown below. Notice that the 


mean of the data in Graph ______ (A/B) is smaller than the
mean of the other collection of data.


[Graphs of raw data by OBSERVATION: GRAPH A and GRAPH B, each with its mean marked]


However, the variability of the two collections of data
(shown above) about their respective means ______ (is/is not)
almost identical. Thus, the two distributions could be
described as having similar ______ (variability/means) but
different ______ (variability/means).
40.


41. 


42. 


Therefore, simply knowing that two collections of
data are similar in terms of their central tendencies
______ (does/does not) indicate whether they are similar in terms
of their variability.
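

The independence of central tendency and variability can be illustrated with a short Python sketch. The two collections below are hypothetical, and `deviations_from_mean` is a name chosen only for this example.

```python
def deviations_from_mean(values):
    """Return each value's deviation from the mean of the collection."""
    mean = sum(values) / len(values)
    return [x - mean for x in values]


low = [9, 10, 11]    # hypothetical data with mean 10
high = [39, 40, 41]  # hypothetical data with mean 40

print(deviations_from_mean(low))   # [-1.0, 0.0, 1.0]
print(deviations_from_mean(high))  # [-1.0, 0.0, 1.0]
```

The means differ by 30, yet the deviations about each mean are identical, so the two collections are equally variable.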


Another illustration of the lack of any relationship 
between the mean of a distribution and its variability is 
given by the following two tables of raw data: 


TABLE A 


1 10 
2 8 
3 6 


TABLE B 


Notice that the data in each table consists of
observations of a ______ (numerical/non-numerical) variable.


The particular reference value from which the
deviations are calculated in each of the previous tables
is the ______ of that collection of data, since the
sum of the deviations in each table equals ______. Thus,
the mean of the data in Table A is ______, whereas the
mean of the data in Table B is ______.
44.


45. 


46. 


We could represent the variability of each collection 
of data by its range. The range of the data in Table A 
is ______ and the range of the data in Table B is ______.


Earlier, we pointed out how the variability of a 
collection of data could be thought of as the degree to 
which the observed values were spread out or dispersed 
about the mean of that collection of data. The ______
is a useful measure of variability because it is the
difference between the value having greatest positive
deviation from the mean and the value having greatest
negative deviation from the mean.


We also noted earlier that the range is not a completely
satisfactory way of representing variability. Two 
distributions may have identical ranges and yet one 
distribution may appear to be much more variable than 
the other. For example, consider the two collections 


of data shown in the following tables: 


15 
5 
12 
7 
8 


DATA A        DATA B


The smallest value in each table is ______ and the
largest value is ______. Notice that the range of both
collections of data equals ______.


While the range is the same in both collections of data,
all the values, except one, are identical in Data ______ (A/B).
48.


49. 


This illustration points up one of the disadvantages of
the range as a way of representing variability. The
range is determined by only two of the observed values
in the collection of data:


1) the ______ observed value, and
2) the ______ observed value.


Since the largest observed value in each table is ______
and the smallest observed value is ______, the collections
of data shown in the following two tables have
______ (the same/different) ranges.
1 


2 
3 
4 
5 
6 1 


TABLE A 


-4 


1 
1 -4 
1 -4 
8 
9 

10 


TABLE B 


We have listed the deviations of the values in each table
from their common mean of ______. The largest and
smallest values in both collections of data are the same.
However, the other observed values in Table ______ (A/B) are all
very close (or identical) to the mean. All the values in
the other distribution are almost as far away from the
mean as are the two extreme values.


10


largest 


smallest 


10 
1 


the same 
50.


51. 


Even though the two extreme deviations in both 
collections of data are the same, the typical size of a 


deviation in Data ______ (A/B) is greater than in the other
collection of data. This feature can be illustrated by the
following graphs of raw data:


[Graphs of raw data, VALUE against OBSERVATION (1 to 6): DATA A and DATA B]


We have indicated the deviations from the mean in both
collections of data with arrows, just as we have done
previously. The lengths of these arrows indicate how
most of the observed values in Data ______ (A/B) are very
close to the mean, whereas the values in the other graph
tend to be farther away from the mean.


We have already seen that the mean of a collection of 
values can be thought of as the typical value. Therefore, 
one way we might represent the typical size of a deviation 
would be to find the mean, or the average of the deviations. 


However, our formula for finding the mean of a collection
of values ($\sum X / N$) says that the first thing to do is to add
together all the values. No matter what the variability of
a distribution, the sum of the deviations from the mean
would always equal ______.
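

This property is easy to confirm numerically. The sketch below reuses Data A and Data B from the earlier frame on variability; the helper name `deviations_from_mean` is an assumption made for illustration.

```python
def deviations_from_mean(values):
    """Return each value's deviation from the mean of the collection."""
    mean = sum(values) / len(values)
    return [x - mean for x in values]


data_a = [10, 9, 10, 10, 11]  # little variability, mean 10
data_b = [2, 18, 10, 1, 19]   # much more variability, mean 10

print(sum(deviations_from_mean(data_a)))  # 0.0
print(sum(deviations_from_mean(data_b)))  # 0.0
```

Since the positive and negative deviations cancel in both collections, their plain average cannot distinguish the two. With data whose mean is not a whole number, floating-point rounding may leave a sum that is only very close to zero.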
53.


54. 


Statisticians have found it useful to represent the 
variability of a distribution by the typical or average 
size of the squared deviations from the mean. Whenever 
you square a number, your answer will be positive, 
whether the number you are squaring is positive or 
negative. 


For example, $2^2 = 2 \times 2 = 4$.


Similarly, $(-2)^2 = (-2) \times (-2) = 4$.


Therefore, whether a deviation is positive or negative,
when you square it, your answer will be ______ (positive/negative).
In the table of data shown below, we have listed 3 
observed values, their deviations from the mean, and 

the square of these deviations. Thus, $X_1$ has a value of
______ and a deviation from the mean of ______. The
square of that deviation is ______ times ______, which
equals ______.


If you add (sum) all 3 observed values, you obtain the
total of the observed values, which is ______. Dividing
this total by ______ indicates that the mean of these three
values is ______.


Adding up the deviations from the mean value of 5 will
naturally yield an answer of ______, because the mean is
defined as that particular reference value from which
the deviations sum to ______.
Observation | Deviation from 5 | Squared Deviation


1 8 
2 -2 
3 5 


positive 
55.


56. 


57. 


58. 


To find the mean of a group of values, you first find 
their total, and then divide this total by the number of 
values. If you wanted to find the mean (average)
squared deviation, you could first add all the squared
deviations and then divide by the number of squared
deviations. In our previous example (see the table in
Frame 53), the total of the squared deviations
equals ______ plus ______ plus ______, which equals ______.


Since there were 3 observations in the collection of data,
you should divide the total of the squared deviations by
______ to find the mean of these squared deviations.


Thus, the mean (average) of the squared deviations
equals ______ divided by ______, which yields the
answer of ______.


Statisticians refer to the average of the squared
deviations as the variance. Therefore, the variance of
the previous collection of data is ______.


The variance is a statistic representing the variability of 
a collection of data. Values that are widely dispersed 
(spread out from the mean) have large deviations from 
the mean. Whether these deviations are positive or 
negative, they will result in large squared deviations. 
Therefore, saying that a collection has a large variance
implies that the values tend to be ______ (spread out from/close to)
the mean.


A collection of data having observations of all the same
value has the ______ (most/least) possible variability.
60.


61. 


62. 


Suppose all the observations in a collection of data 
had the value 10. The mean of that collection of data 
would equal ______ and each observation would have a
deviation from the mean equal to ______.


If each deviation equaled zero, each deviation squared
would also equal zero. Since the variance of a collection
of data is the typical size of the squared deviations, the
variance of this collection of data would equal ______.


A collection of data having the least possible variability
would have, therefore, a variance equal to ______. It
would also have a range equal to ______.
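

As a check on this limiting case, the following Python sketch computes both statistics for a collection in which every observation equals 10. The number of observations (four) and the function name `variance` are assumptions made for the example.

```python
def variance(values):
    """Average of the squared deviations from the mean (divides by N)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


constant = [10, 10, 10, 10]  # every observation has the same value
print(variance(constant))             # 0.0
print(max(constant) - min(constant))  # 0
```

Neither statistic can fall below zero, so a collection of identical values marks the least variability any collection can show.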


Consider the collection of data shown below: 


The mean of this collection of data is 6. We have left
room in the third column of the table to insert the deviation
of each observation from the mean. The deviation of
observation 1 from the mean equals ______ minus ______,
which equals ______.


Squaring the deviation of observation 1 from the mean
equals ______ times ______, which equals ______.
63.


64. 


65. 


66. 


We could find the deviation and the square of that
deviation in a similar manner for each of the other
observed values and summarize our work in the following
table:


Deviation From 6 | Sq. of the Deviation
        2        |          4
       -1        |          1
       -1        |          1
        0        |          0


To find the average of the squared deviations (the
variance), we would ______ all of the squared
deviations and divide this total by ______.


Therefore, the variance of the previous collection of
data equals ______ divided by ______. Thus, 1.5 is the
______ of the previous collection of data.


The variance of the data in the following table equals
______ divided by ______. The variance, therefore,
equals ______.


Observation | Deviation from 10 | Sq. of Deviation

15
5
5

15
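

The table-building procedure used in these frames can be sketched in Python. The observed values 8, 5, 5, and 6 are an assumption: they are simply the mean of 6 plus the tabled deviations 2, -1, -1, and 0.

```python
# Assumed observed values: mean 6 plus the deviations 2, -1, -1, 0.
values = [8, 5, 5, 6]
mean = sum(values) / len(values)

# One row per observation: (value, deviation, squared deviation).
rows = [(x, x - mean, (x - mean) ** 2) for x in values]
for value, deviation, squared in rows:
    print(value, deviation, squared)

total_squared = sum(squared for _, _, squared in rows)
variance = total_squared / len(rows)
print(total_squared, variance)  # 6.0 1.5
```

The same steps apply to any table: fill in the deviation column, square each entry, total the squares, and divide by the number of observations.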


Since the variance is the ______ of the squared
deviations from the mean, the formula for the variance
will be similar in some ways to the formula for the mean
we considered earlier.


To find the mean of a group of values, we first sum all
the values and then divide this sum by the ______
of values.
67.


68. 


69. 


70. 




In a similar manner, to find the mean of a group of 
squared deviations, we first sum all the squared 
deviations and then divide this sum by the number of 
squared deviations. In other words, the total of all the 
squared deviations divided by the number of deviations 
equals the mean of the squared deviations, which 
particular mean we call the ______.


Representing the mean of a group of values by $\bar{X}$ and the
value of a particular observation by $X$, we would
represent the deviation of that observation from the
mean as $(X - \bar{X})$. The expression $(X - \bar{X})^2$ would
represent the square of the previous deviation.


If your collection of data consisted of three observations,
you could represent the ______ of these three
observations as:

$$X_1 + X_2 + X_3$$


Similarly, you could represent the sum of the ______
of these three observations from their mean as:

$$(X_1 - \bar{X}) + (X_2 - \bar{X}) + (X_3 - \bar{X})$$


Finally, you could represent the sum of the ______
deviations as:

$$(X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + (X_3 - \bar{X})^2$$


Just as we represented the sum of all the raw scores by
$\sum X$, we could represent the ______ of the deviations
from the mean by $\sum (X - \bar{X})$.
deviations


squared 


sum 
Regardless of what collection of data we are describing,
however, we know that $\sum (X - \bar{X})$ will always equal
______, according to our definition of the mean.


Using the symbol ______ (the capital Greek letter ______)
in the same way as we did when we wrote $\sum X$ and
$\sum (X - \bar{X})$, we could represent the total of all the squared
deviations by ______ $(X - \bar{X})^2$.


Since the variance is the mean of the squared deviations,
we could represent the ______ of a collection
of data as:

$$\frac{\sum (X - \bar{X})^2}{N}$$


Statisticians use the symbol $\sigma^2$ to represent the variance.
Using this symbol, a formula for the variance could be
written as follows:

$$\sigma^2 = \frac{\sum (X - \bar{X})^2}{N}$$


The symbol $\sigma$ is the uncapitalized form of the Greek
letter sigma. The summation symbol ______ was the
capital Greek letter sigma. For this reason, the
variance, represented by $\sigma^2$, is often referred to as
______ squared.
Tt.


We now have defined two formulas. The first formula
is:

$$\bar{X} = \frac{\sum X}{N}$$

where $\bar{X}$ (called the ______) is a statistic
representing the c______ t______
of a distribution.


The second formula is:

$$\sigma^2 = \frac{\sum (X - \bar{X})^2}{N}$$

where $\sigma^2$ (called the v______) is a statistic
representing the ______ of a distribution.


Let's try using the formula for both the mean and the
variance on the collection of data shown in the following
table.


1 
2 
3 


Notice that we have added two extra columns to a table
of raw data. In the third column of the table, we could
list the ______ of each observation
from the mean. In the fourth column we could list the
______ of each deviation from the mean.


Before you can calculate the deviation of each value from
the mean, you must first calculate the mean itself.
Since the formula for the mean is ______, you must
first find the ______ of all three values and then
divide this result by ______ (since N = 3).
mean


central tendency


variance


variability


deviation


square


$\frac{\sum X}{N}$


sum (total)
78. The sum of the observed values is ______. Dividing
this sum by 3, you find the mean to be ______.


79. Accordingly, you could replace the headings on columns
3 and 4 in the previous table as follows:


1 
2 
3 


Note that here we have replaced ______ found in the
earlier table with its actual value, ______.


Notice that since the mean is 5, the deviation of
observation 1 from the mean equals ______, and the
square of this deviation equals ______.


The deviation of observation 2 equals ______, and the
square of this deviation equals ______.


Finally, the deviation of observation 3 equals ______,
and the square of this deviation equals ______.


80. We have summarized these answers in the following 


table: 


2 


1 2 1 
2 
3 


Remember, the formula for the variance,

$$\sigma^2 = \frac{\sum (X - \bar{X})^2}{N},$$

says to sum all of the squared deviations and then to
______ this sum by the number of observations.
The variance of the data in the previous table, therefore,
equals ______ divided by ______. This means $\sigma^2$ = ______
for this collection of three observed values.

divide
2, 3, 5


Suppose the deviation from the mean of every value in a 
collection of data equaled 2 or -2. The square of each 
of the positive deviations would equal 2 times 2, or 4, 
and the square of every negative deviation would equal
______ times ______, which would also equal 4.

-2, -2


Since the square of every deviation from the mean would
be 4, the typical or mean squared deviation would equal 4. We
would, therefore, say that the variance of the distribution
was ______.


Find the error in the following table. 


Observation 


Notice that you would correct this error by changing
______ to ______, since the deviation of 10 from the
reference value 5 is ______ (+5/-5).

-5, +5 (or simply 5)
+5
83.


84. 


85. 


86. 


87. 


Since the sum of the squared deviations equals ______
and since there are 3 observations, the variance of
the previous data would equal ______ divided by ______.
Therefore, $\sigma^2$ = ______.


If your data consisted of the values 3 and 7, the mean
would be ______. The variance, therefore, would
equal ______ divided by ______. In other words,
$\sigma^2$ = ______.


Since $\bar{X} = \frac{\sum X}{N}$, the mean of 3 and 7 would equal $\frac{3 + 7}{2}$.


Similarly, if the mean equals 5, $\sigma^2 = \frac{\sum (X - \bar{X})^2}{N}$ would equal

$$\frac{(3 - 5)^2 + (7 - 5)^2}{2},$$

which equals

$$\frac{(-2)^2 + (2)^2}{2} = \frac{4 + 4}{2} = 4.$$

Since the variance can be thought of as the typical size 

of a squared deviation, statisticians have found it useful 
to assign a special name to the square root of the 
variance. They call the square root of the variance the 
standard deviation. Therefore, the standard deviation 

of a distribution is a deviation which, if squared, would
equal the ______ (even if none of the observed
values actually has this particular deviation).


If the variance of your data were 9, an observation which
had a positive deviation from the mean of ______ would
have a squared deviation equal to the variance.



variance 




88. 


89. 


90. 
In other words, if the variance of your data were 9,
the standard deviation would be ______, since $(3)^2 = 9$.


If the variance of your data were 25, the standard
deviation would equal ______, since ( ______ )$^2$ = 25.

Just as a group of values may have a mean that is not
equal to any of the values, it is not necessary for any
particular value in a collection of data to have a
deviation from the mean exactly equal to a standard
deviation. Therefore, whereas the standard deviation
can be thought of as the typical size of a deviation in
some collection of data, it ______ (is/is not) necessary for any
value in the data to actually have this deviation.


To summarize, the variance represented by the symbol
______ is the typical or average squared deviation.
Any squared deviation equal to this average squared
deviation would be called the ______.
Remember, it ______ (is/is not) necessary that any value in
the data actually have this particular deviation.


The variance is equal to the square of the standard
deviation. Therefore, the standard deviation is equal to
the ______ ______ of the variance.


Since the variance is represented by $\sigma^2$, the standard
deviation could be represented by $\sqrt{\sigma^2}$. But $\sqrt{\sigma^2}$ is
simply $\sigma$. Thus, the variance is represented by
______ ($\sigma$/$\sigma^2$) and the standard deviation by ______ ($\sigma$/$\sigma^2$).
By using $\sigma^2$ to represent the ______ and $\sigma$ to
represent the ______ ______,
we emphasize the fact that the variance is simply the
______ of the standard deviation. In other
words, we emphasize that the ______ ______
of the variance equals the standard deviation by letting
______ represent the standard deviation and
______ represent the variance.


Suppose all the values in a collection of data deviated
from the mean by either +3 or -3. The variance of that
collection of data would equal ______ and the standard
deviation would equal ______.


In other words, $\sigma^2$ equals ______ (9/3) and $\sigma$ equals ______ (9/3).
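

The relation between the two statistics can be verified with a brief Python sketch. The choice of four deviations is an assumption; any collection whose deviations are all +3 or -3 gives the same result.

```python
import math

# Every deviation from the mean is either +3 or -3.
deviations = [3, -3, 3, -3]

variance = sum(d ** 2 for d in deviations) / len(deviations)
standard_deviation = math.sqrt(variance)

print(variance, standard_deviation)  # 9.0 3.0
```

Squaring removes the signs, so each squared deviation is 9, their average is 9, and the square root of that average returns the common size of the deviations, 3.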


We have considered three statistics used to represent
the central tendency of a collection of data. These three
statistics are the ______, the ______,
and the ______.


The ______ is the most frequently occurring value
in a collection of data. The ______ is a value
which would divide a list of the ranked data into two
equal parts. The ______ is that particular
reference value from which the sum of the deviations
equals 0.
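

The three statistics of central tendency can be computed side by side with Python's standard `statistics` module. The data below are hypothetical and were chosen only so that each statistic is easy to verify by hand.

```python
import statistics

data = [2, 3, 3, 5, 7]  # hypothetical observations, already ranked

most_frequent = statistics.mode(data)  # 3 occurs twice
middle = statistics.median(data)       # third of the five ranked values
average = statistics.mean(data)        # 20 / 5

print(most_frequent, middle, average)  # 3 3 4

# The mean is the reference value from which the deviations sum to 0.
print(sum(x - average for x in data))  # 0
```

In this example the mode and median coincide while the mean is larger, because the value 7 lies farther from the center than any value below it.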


We have also considered three statistics used to represent
the variability or dispersion of a collection of data. They
are the ______, the ______, and its
square root, called the ______ ______.




variance 


standard deviation 


square 


square root 


$\sigma$, $\sigma^2$


mean, median 


mode 


mode 


median 


mean 


range, variance 
100.


The ______ is the difference between the
largest and smallest observed value. The ______
is the typical or average of the squared deviations from
the mean of that collection of data. The standard
deviation is a deviation which when squared will equal
the ______.


We have also found a way to write rules for calculating
the mean and the variance. To find the typical value or
mean, we ______ all the values and ______
by the number of values. The formula for the mean is
written, therefore, as:

$$\bar{X} = \frac{\sum X}{N}$$


To find the typical squared deviation (the variance), we
sum all of the ______ ______ and
divide by the number of values. The formula for the
variance, therefore, is written as:

$\sigma^2$ = ______


The standard deviation is simply the ______ ______
of this variance.


The statistics which describe the ______ ______
of the distribution represent in one
way or another the typical value to be found in that
distribution. A statistic representing the ______
of the distribution describes the degree to which the
observed values are spread out or dispersed.
range
variance


variance


add (sum), divide


squared deviations


$\frac{\sum (X - \bar{X})^2}{N}$


square root
central tendency


variability
101.


Statistics describing central tendency do not tell you 
anything about the variability of a distribution. 
Statistics describing variability do not tell you anything 
about central tendency. For example, consider the four


distributions shown in the following graphs. 
[Frequency distribution graphs, FREQUENCY against VALUE: DISTRIBUTION A, DISTRIBUTION B, DISTRIBUTION C, and DISTRIBUTION D]


Notice that distributions A and B are similar in terms of
their ______ (central tendency/variability),
yet quite different in terms of their
______ (central tendency/variability).


variability
102. On the other hand, Distribution A is similar to
Distribution C in terms of ______ (central tendency/variability),
whereas the mean of Distribution A appears to be
______ (larger than/equal to) the mean of Distribution C.

variability
larger than


103. Distribution B and D are similar in terms of
______ (central tendency/variability).

variability


104. The choice of one statistic over another to represent
the same feature of a distribution depends upon the particular
feature you wish to represent. For example, consider
the distribution shown below. Except for a few extreme
(unusually large) values, most of the values are grouped
around the value ______.    2
2/5

[Graph: FREQUENCY (0 to 40) against VALUE (0 to 12).]

If you represented the central tendency of the preceding
data by the mean, the extreme values would tend to pull
the mean away from the value 2. On the other hand, if
you used the mode to represent the central tendency of
the distribution, its value would be ______.    2
In other words, the mode might be a better way of representing
the central tendency of this unusual distribution,
since the mode ______ influenced by the few    is not
is/is not
extreme values found in the distribution.
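
The effect described in the frame above can be checked with a short Python sketch. The data are hypothetical (they are not read from the graph in the text), and the helper name `central_tendency` is an assumption made for illustration.

```python
from statistics import mode


def central_tendency(values):
    """Return (mean, mode) for a collection of observed values."""
    return sum(values) / len(values), mode(values)


# Hypothetical data: most values are grouped around 2,
# and two unusually large values sit far out at 11 and 12.
with_extremes = [1, 2, 2, 2, 2, 2, 3, 3, 11, 12]
without_extremes = [1, 2, 2, 2, 2, 2, 3, 3]

print(central_tendency(with_extremes))     # (4.0, 2)
print(central_tendency(without_extremes))  # (2.125, 2)
```

The two extreme values pull the mean from about 2 up to 4, while the mode stays at 2 in both cases.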
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REVIEW II 
FILL IN THE BLANKS: 


1. The most frequently occurring value in the distribution
is called the ______.    mode


2. NUMBER OF ERRORS


3
6
2
3


In the table above, pick out the modal number of
errors.    3

3. A value in a collection of data which is smaller than half
of the other observed values and larger than the
remaining values is called the ______.    median

4. Find the median in the following collection of data:
14, 11, 8, 4, 2    8

5. Find the median in the following set of numbers:
9, 7, 8, 6, 4    7
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MULTIPLE CHOICE: 


6. It is possible for two collections of data to have 
different means and: 


a. radial variability. 
b. diametric variability. 
c. the same variability. 


d. none of the above


7. The degree to which the observed values are spread out
or are dispersed from the mean of a collection of
data can be referred to as the:
a. mean.
b. variability of that collection of data.
c. median.
d. none of the above


8. The average of the squared deviations from the mean
is called the:
a. standard deviation.
b. variance.
c. mean.
d. none of the above


9. The square root of the variance is called the:
a. mean.
b. median.
c. standard deviation.
d. none of the above


10. If the variance of your data were 36, the standard
deviation would be:
a. 9.
b. [illegible]
c. [illegible]
d. none of the above


11. The symbol representing the standard deviation is:
d. none of the above

TRUE OR FALSE: 


12. 

a negative deviation from the reference value. 

13. Let us assume that the reference value is 8 and the 
observation is 6. In this case, the deviation is 2. 

14. If we add all the deviations of a group of observed values
from a particular reference value, and the answer is 
zero, that reference value is the mean of those values. 

15. The difference between the largest and smallest value 


in a collection of data is called the median. 

16. If your data consisted of the values 73, 22, 14, 91,
and 11, the range would be 47.

17. It is possible for two collections of data to have the
same mean, but different variability.
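
Several of the true-or-false items above can be checked by direct arithmetic. The sketch below is one way to do so in Python. The function names are illustrative assumptions, and the numbers are the ones quoted in the items (a reference value of 8 with an observation of 6, and the values 73, 22, 14, 91, and 11).

```python
def value_range(values):
    """Difference between the largest and the smallest value."""
    return max(values) - min(values)


def deviations(values, reference):
    """Signed deviation of each value from a reference value."""
    return [x - reference for x in values]


data = [73, 22, 14, 91, 11]
print(value_range(data))   # 80, since 91 - 11 = 80

# An observation of 6 against a reference value of 8 deviates by -2.
print(deviations([6], 8))  # [-2]

# Deviations taken from the mean add up to zero (up to rounding error).
mean = sum(data) / len(data)
print(abs(sum(deviations(data, mean))) < 1e-9)  # True
```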
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Section V: Types of Distributions 


In addition to describing the central tendency and
variability of a distribution, it is often useful to say
something else about the general shape of the distribution.
For example, consider first the distribution shown in
Figure A below. Now look at Figure A'. In Figure A',
we have shown how half of Figure A would look reflected
in a mirror.

[Figure A and Figure A': FREQUENCY (0 to 20) against VALUE; a MIRROR stands at the center of Figure A'.]

Notice how the reflection of the left half of the
distribution has exactly the same shape as the right
half of the distribution. (The right half is behind the
mirror in Figure A'.) In other words, if we cut the
distribution shown in Figure A in half along the line
where we placed the mirror in Figure A', we would
have divided the distribution into two parts with the
same shape but facing in opposite directions.
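
The mirror test can also be expressed as a small computation. The Python sketch below assumes the distribution is given as a list of column heights for equally spaced values, with the mirror placed at the middle of that list. Both lists of frequencies are hypothetical.

```python
def is_mirror_image(frequencies):
    """True if the left half, reflected, has the same shape as the right half.

    `frequencies` lists the column heights from the lowest value to the
    highest; the mirror is assumed to stand at the middle of the list.
    """
    return frequencies == frequencies[::-1]


figure_a = [2, 5, 10, 20, 10, 5, 2]   # hypothetical column heights
figure_b = [20, 10, 6, 4, 3, 2, 1]    # hypothetical column heights

print(is_mirror_image(figure_a))  # True
print(is_mirror_image(figure_b))  # False
```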

In comparison, consider the distribution shown in
Figure B below. In Figure B' the reflection in the
mirror ______ have the same shape as that
does/does not
part of the distribution behind the mirror.

[Figure B and Figure B': FREQUENCY (0 to 20) against VALUE; a MIRROR stands at the center of Figure B'.]

Therefore, of the previous distributions A and B,
Distribution ______ can be divided in half so that the
A/B
right half looks like a "mirror image" of the left half,
whereas it would not be possible to divide the other
distribution in such a manner.

A distribution which can be divided so that its left
half appears to be a mirror image of its right half is
called a symmetrical distribution. If a distribution
cannot be so divided, it is called an asymmetrical
distribution. Thus, of the two distributions shown
below, Distribution ______ is symmetrical, whereas the
A/B
other distribution is ______.

[Graphs of Distribution A and Distribution B: frequency (0 to 20) against VALUE.]

Distribution A would be called symmetrical since you
could place a mirror so that the reflection in the mirror
had exactly the same shape as that part of the distribution
hidden behind the mirror. (See the illustration below.)
In other words, the right half of the distribution is a
"mirror image" of the left half of the distribution.

On the other hand, the reflection of the left half of
Distribution B (Distribution B') ______ have the
does/does not
same shape as that part of the distribution hidden
behind the mirror. We would say, therefore, that
Distribution B was ______.
symmetrical/asymmetrical

Of the four distributions shown below, Distributions
______ and ______ would be described as symmetrical,
while the other two distributions would be described as
asymmetrical.

[Graphs of Distributions A, B, C, and D: frequency against VALUE.]




does not 


asymmetrical 




In some distributions, the scores seem to be piled
up towards one end of the distribution. For example,
in Distribution ______ (shown below) most of the observed    A
A/B
values occurred near the low-valued end of the
distribution, with only a few values occurring near the
upper end of the distribution.

The scores in Distribution B tend to be piled up near
the ______ values, with only a few observed values    high
high/low
down near the low end of the distribution.

Both of the distributions shown above would be referred
to as ______ distributions.    asymmetrical
asymmetrical/symmetrical

Asymmetrical distributions in which most of the scores
are piled up near one end could be regarded as
"lopsided."

Distributions of this sort are said to be skewed. If a
skewed distribution has most of the observations piled up
near its low values, the distribution is said to be
positively skewed. If the distribution has
most of the observations piled up near the high values,
the distribution is said to be negatively skewed. Thus,
of the three distributions shown below, Distribution ______
is positively skewed, whereas Distribution ______ is
negatively skewed. Distribution ______, however, is
neither positively nor negatively skewed, since this
distribution is symmetrical.
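
One common way to attach a direction to skew is to average the cubed deviations from the mean. The text does not use this technique; it is swapped in here only as an illustration. A positive average means the few far-out values lie on the high side, so the pile is at the low end (positive skew). A negative average means the reverse. The data and the function name below are hypothetical.

```python
def skew_direction(values):
    """Classify skew by the sign of the average cubed deviation from the mean."""
    n = len(values)
    mean = sum(values) / n
    third_moment = sum((x - mean) ** 3 for x in values) / n
    if third_moment > 1e-9:
        return "positively skewed"
    if third_moment < -1e-9:
        return "negatively skewed"
    return "no skew detected"


piled_low = [1, 1, 1, 2, 2, 3, 6]           # most values near the low end
piled_high = [6, 6, 6, 5, 5, 4, 1]          # most values near the high end
balanced = [1, 2, 2, 3, 3, 3, 4, 4, 5]      # left half mirrors right half

print(skew_direction(piled_low))   # positively skewed
print(skew_direction(piled_high))  # negatively skewed
print(skew_direction(balanced))    # no skew detected
```

A result of "no skew detected" is what a symmetrical distribution produces, although this measure alone does not prove that a distribution is symmetrical.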
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10. 


Remember that a ______ distribution is
skewed/symmetrical
regarded as "lopsided." A ______
skewed/symmetrical
distribution, however, could be regarded as perfectly
balanced at its center, so balanced that its left half is a
"mirror image" of its right half.


We can think of any collection of data as a record of the
observed ______ of a variable.


Any collection of data can be described in terms of the
various frequencies with which the different values occur
in the data. This group of frequencies is referred to as
the ______ of the data.


Often the difference between two distributions can be
made apparent by drawing a picture of the distribution in
the form of a frequency graph. Graphs of this sort make
the "shape" of the distribution apparent.


For example, we could describe Distribution ______ (shown
A/B
below) as a symmetrical distribution and Distribution
______ as a ______ skewed distribution.
A/B positively/negatively
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11. 


12. 


13. 


You would describe Distribution A (above) as symmetrical
since its left half is simply the reverse of its right
half. In other words, the left half of Distribution A
mirrors its right half. Distribution B, however, is not
symmetrical. Distribution B is "lopsided," with most
of the observed values piled up near one end. Lopsided
or skewed distributions are described as
positively skewed when the piling up occurs near the
______-valued end of the distribution, and negatively
high/low
skewed when the piling up occurs near the ______-
high/low
valued end.
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VALUE 


Most of the observed values in the above graph are
"piled up" near the ______ of the distribution, with
ends/center
very few observed values near the ______ of the
ends/center
distribution.


The previous distribution, however, ______
would/would not
be described as symmetrical, since its left side
______ mirror its right side.
does/does not
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15. 


16. 


As a psychologist, you will encounter distributions with
many different shapes. You will find, however, that
certain types of distributions are encountered more
often than others. For example, a very common type
of distribution is one in which most of the values are
piled up near the mean, with fewer and fewer values
occurring farther from the mean.


In other words, values similar to the mean would have
the ______ frequencies, whereas values    larger
larger/smaller
farther away from the mean would have ______    smaller
larger/smaller
frequencies.

Distribution ______ (below) would be an example of the    A
A/B
common type of distribution we just described.

[Graphs of Distribution A and Distribution B: frequency against VALUE.]

This commonly encountered type of distribution is often
described as "bell-shaped" because its shape is similar
to a bell's.

Although the shape of the previous distribution is not
identical to the shape of a bell, it is convenient to
describe this kind of distribution as approximately
"bell-shaped." You could describe the difference between the
two distributions below by saying that Distribution ______    A
A/B
is approximately "bell-shaped," whereas Distribution
______ is not.    B
A/B

Let's consider an example of a "bell-shaped" distribution.
Suppose you filled a glass jar with 20 marbles and then
asked a large number of students to estimate how many
marbles there were in the jar. Some of the estimates
would be too high and others would be too low. You would
expect, however, that most of the estimates would be
fairly close to the actual number of marbles in the jar.
Occasionally, you would obtain some poor estimates, such
as 15 or 25. On the other hand, you would expect
estimates near ______ to be more frequent    twenty
twenty/twenty-five
than estimates near ______.    twenty-five
twenty/twenty-five
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19. You would probably find the distribution of these 
estimates similar to which of the distributions shown 


below? 

20. Both of the above graphs would be called ______    relative
absolute/relative
frequency distributions, since they show the ______    proportion
of subjects who estimated each of the possible values.


21. If Distribution A (above) had been the distribution of your
data, the subjects would have been acting very strangely.
According to Distribution A, many of the subjects
over-estimated and many of the subjects under-estimated but
very ______ of the subjects made estimates close    few
to the true number of marbles in the jar. (Remember,
there were 20 marbles in the jar.)


22. According to Distribution B (above), none of the
subjects' estimates was greater than ______    23
or less than ______.    17
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26. 


You could describe the variability of the estimates in B
by saying that "the range of the estimates was ______."
In other words, the range equals ______ minus ______.


The proportion of students who estimated that there were
more than 21 marbles in the jar is simply the proportion
who estimated that there were 22 or more. If [illegible] of the
students estimated 22 marbles, [illegible] of the students
estimated 23 marbles, and none of the students estimated
more than 23 marbles, you could say ______ of the
students estimated there were more than 21 marbles in
the jar.


The proportion of students who estimated more than 20
marbles is simply the proportion of students who
estimated there were 21 marbles, plus the proportion
who estimated there were ______ marbles, plus the
proportion who estimated there were ______ marbles.


Imagine that the distribution of estimates was as follows:
(Notice that we have shaded the columns representing
subjects who estimated more than ______ marbles.)
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28. 


29. 


30. 


We pointed out earlier that the sum of all the proportions
in a proportional (relative) frequency distribution has to
equal ______. Since the height of each column represents
the proportion of students who estimated a particular
value, the total height of all the columns added together
must equal ______.


According to the preceding graph, [illegible] of the students
estimated 18, [illegible] estimated 19, [illegible] estimated 20,
[illegible] estimated 21, whereas the remaining [illegible] of the
students estimated ______.


Thus, the total of the proportions represented by all the
columns is ______/16, which equals ______.


If we added together the heights of the two shaded
columns in the previous distribution, they would form a
column whose height was equal to the proportion of
students who estimated more than ______ marbles.
Notice that the proportion of students estimating more 


5 
than 20 equals ______ plus ______, or ______.
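
The two facts used in these frames (the proportions of a relative frequency distribution add up to 1, and the proportion above a cutoff is the sum of the columns above it) can be sketched in Python. The proportions below are hypothetical sixteenths chosen for illustration; they are not read from the graph in the text, whose fractions did not survive. The function name is likewise an assumption.

```python
from fractions import Fraction

# Hypothetical relative frequency distribution of marble estimates.
proportions = {
    18: Fraction(1, 16),
    19: Fraction(4, 16),
    20: Fraction(6, 16),
    21: Fraction(4, 16),
    22: Fraction(1, 16),
}


def proportion_more_than(distribution, cutoff):
    """Add the heights of all columns for values above the cutoff."""
    return sum(p for value, p in distribution.items() if value > cutoff)


print(sum(proportions.values()))              # 1
print(proportion_more_than(proportions, 20))  # 5/16
print(proportion_more_than(proportions, 21))  # 1/16
```

Exact fractions are used rather than decimals so that the total comes out to exactly 1.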


In the following distribution we have shaded the columns
representing students who estimated ______ than
fewer/more
20 marbles.
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31. 


32. 


33. 


A column as high as the two shaded columns combined
would represent the proportion of students who estimated
fewer than ______ marbles. This proportion would be ______.


In the following distribution we have shaded columns
corresponding to people who estimated fewer than ______
or more than ______ marbles.

A subject who estimated 22 marbles would have made an
error of 2 since there are actually 20 marbles in the jar.
A subject who estimated 18 marbles would also have
been in error by 2. The estimates indicated by the
shaded columns in the previous distribution represent
subjects who made an error of more than ______.


If the mean of the distribution of estimates was 20, an
estimate of 22 would correspond to a positive deviation
from the mean of ______.


Similarly an estimate of ______ would correspond to a
negative deviation of -2.




21 


18 




34. 


35. 


36. 


The shaded columns in the previous distribution
indicate the proportion of subjects whose estimates had
either a positive deviation from the mean of ______
or more, or a negative deviation of ______ or more.
(We use the words "or more" to indicate an estimate
even farther away from the mean in either a positive
or negative direction.)


The shaded column in the following distribution
represents the proportion of estimates that had a ______
positive/negative
deviation from the mean estimate of ______.
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We will use the phrase absolute deviation or absolute 
error when we are interested only in the distance 
between a value and the mean and not in whether the 


deviation is positive or negative. 


In other words, when we are interested only in the
difference between a value and the mean and not in
whether this difference is positive or negative, we will
use the phrase ______ ______ or ______ ______.
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37. 


38. 


39. 


40. 


A value that deviates from the mean by +2 is the same
distance away from the mean as a value having a deviation
of -2. The value ten deviates from the value of eight by
+2 (or simply 2). The value 6 deviates from 8 by -2.
Both 6 and 10, however, are the same distance away
from the value ______. This is what is meant when we
say that the value 10 and the value 6 have the same
absolute deviation from 8.
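
The distinction between a signed deviation and an absolute deviation can be shown in a few lines of Python. The function names are illustrative assumptions; the numbers are the ones used in the frame above.

```python
def deviation(value, reference):
    """Signed deviation: positive above the reference, negative below it."""
    return value - reference


def absolute_deviation(value, reference):
    """Distance from the reference, ignoring direction."""
    return abs(value - reference)


print(deviation(10, 8), deviation(6, 8))                    # 2 -2
print(absolute_deviation(10, 8), absolute_deviation(6, 8))  # 2 2
```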


We just used an illustration in which 100 students
attempted to estimate the number of marbles in a jar
containing 20 marbles. An estimate of 22 would have
been just as accurate as an estimate of 18, since both
estimates would have been in error by ______.


If the mean of the estimates was 20, an estimate of 22
would represent a positive deviation of 2 and an estimate
of 18 would represent a negative deviation of -2.
Therefore, both an estimate of 22 and an estimate of 18
would represent the same ______ ______
from the mean.


Suppose the distribution of student estimates was as 


follows: 

The mean of this distribution is 20. Therefore, an
estimate of 24 would represent a positive deviation of
______, and an estimate of 16 would represent a
negative deviation of ______.


On the previous distribution graph, the shaded columns
represent estimates having an absolute deviation of less
than ______.


The unshaded columns in the previous graph indicate the
proportion of estimates having an absolute deviation
from the mean of more than ______.


In a bell-shaped distribution, large absolute deviations
are ______ frequent than small absolute deviations.
more/less


The small absolute deviations are more frequent because
most of the observed values are clustered around the
mean in a bell-shaped distribution. Estimates much
larger or much smaller than the mean have a ______
large/small
absolute deviation and are less frequent
than those values with ______ absolute deviations.
large/small


Students who estimated 25 marbles are unusual in the
sense that few students made estimates that large.
Similarly, students estimating only 15 marbles are also
unusual, since there were very few estimates that small.
In other words, the ______ the absolute deviation
larger/smaller
of a student's estimate from the mean, the more unusual
was his estimate.
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48. 


In the distribution shown below, we have shaded the
columns representing the proportions of students who
made either unusually ______ or unusually ______
estimates.

[Graph: proportion of students against ESTIMATE.]


An estimate representing a large absolute deviation from
the true number of marbles in the jar would be considered
a ______ estimate.
good/poor


Suppose you conducted the marble estimation experiment
with two groups of students, one group of 8-year-old
students and another group of 18-year-old students.
While you would expect the 18-year-old students to make
some errors in their estimates, you would expect them
to be more accurate than you would the younger students.


In other words, although the older students would make
some mistakes, you would expect the absolute deviations
of their estimates from the true number of marbles to
be generally ______ than those of the younger
larger/smaller
students.
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48. 


49. 


Accordingly, if the following distributions represented
the distributions of estimates from the two groups,
Distribution ______ is probably the distribution of
estimates for the younger group.

[Graphs of Distribution A and Distribution B: PROPORTION against ESTIMATE.]


Distribution B represents the older students' estimates.
It is clear that the older students tended to make more
accurate estimates than did the younger students. The
older students' estimates (Distribution B) tended to be
closer to the true number of marbles than were the
estimates of the younger students. You could say that
the variability of the ______ students' estimates
younger/older
was greater than that of the other group.
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50. 


51. 


52. 


We could represent the difference in ______
between the two distributions by the range. The range
of Distribution A is ______ and the range of
Distribution B is ______.


Estimates of either ______ or ______ would have an
absolute deviation of 2 from the true number of marbles
(20), since the first estimate would be 2 fewer than the
actual value and the second estimate would be 2 greater
than the actual value.


The previous distributions are reproduced below. This
time we have shaded the columns representing estimates
whose absolute deviations from the true value of 20 were
______ or more.
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GRAPH B 
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53. 


54. 


55. 


56. 


57. 


One way of comparing the accuracy of the younger and
older students' estimates is to compare the proportion
of students in each group whose estimates differed from
the true value by 2 or more. The proportion of students
who made errors greater than one is indicated by the
______ columns in the previous graphs.    shaded
shaded/unshaded


Estimates whose absolute deviation from the true value
of 20 was greater than one occurred more often in the
group of ______ students than they did in the other    younger
younger/older
group.


Among the younger students, an estimate of 22 was
______ unusual than an estimate of 21, since the    more
more/less
proportion of younger students who estimated 22 was
______ than the proportion of younger students    smaller
greater/smaller
who estimated 21.


An estimate of 22, however, was ______ unusual for    less
more/less
younger students than it was for older students, since
the proportion of younger students who made estimates
of 22 is ______ than the proportion of older    larger
larger/smaller
students who made estimates of 22.


Compared only with the rest of the people in his own age
group, a young student who estimated 22 would not have
performed quite so poorly as an older student who had
estimated 22, since an estimate of 22 was more common
(occurred more frequently) in the ______ group    younger
younger/older
than it was in the other group.
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58. 


59. 


Suppose you decided to give a prize to all the students
whose estimates were within one marble of the true value.
Or, to put it differently, suppose you decided to give a
prize to all the students whose estimates had an absolute
deviation from the true value of ______ or less.    1


The ______ columns in the following graphs    unshaded
shaded/unshaded
indicate the proportions of students in each age group who
would receive a prize.
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60. 


61. 


If you gave a prize only to those younger students
whose estimates were within 1 of the actual value, the
unshaded columns in Graph ______ (below) would indicate    B
the proportion of younger students who received a prize.

[Graph A and Graph B: YOUNGER STUDENTS' ESTIMATES (17 to 23), PROPORTION of students.]

Since the older students were more accurate in
estimating the number of marbles, ______ of their
fewer/more
estimates would meet the requirements for a prize than
would those of the younger students because ______
more/less
of the older students had estimates close enough to the
true value.
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62. 


The unshaded columns in the following graphs indicate
the proportion of older students who would receive a
prize if we only gave a prize to students who made errors
of ______ or less.


The other graph indicates the proportion of younger
students who would receive a prize if we gave a prize to all
younger students who made an error of ______ or less
in their estimates.

[Graph A; Graph B: OLDER STUDENTS' ESTIMATES.]
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62. 


63. 


64. 


65. 


The proportion of younger students obtaining prizes and
the proportion of older students obtaining prizes
______ be approximately the same if we used
would/would not
these two rules for awarding prizes.


We could conclude from these considerations that a
______ student whose estimate was within 1
younger/older
of the actual value was doing just about as well in
relation to the rest of the people in his age group as was
a(n) ______ student whose estimate was within 2
younger/older
of the actual value.


By considering the difference in variability between the
two groups, we established rules for giving prizes whereby
approximately the same proportion of students received
prizes in each group. Since large errors (absolute
deviations) were more frequent in the younger group,
a ______ student would not be required to be
younger/older
quite as accurate in order to win a prize as would a
student in the other age group.
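
The reasoning behind the two prize rules can be sketched in Python. The counts for the two age groups below are hypothetical (20 students each, invented for illustration), and `proportion_within` is an assumed helper name. The point is only that a tighter rule for the less variable group and a looser rule for the more variable group can award prizes to about the same proportion of each.

```python
from fractions import Fraction


def proportion_within(counts, true_value, max_error):
    """Proportion of estimates whose absolute error is max_error or less."""
    total = sum(counts.values())
    winners = sum(n for estimate, n in counts.items()
                  if abs(estimate - true_value) <= max_error)
    return Fraction(winners, total)


# Hypothetical counts of estimates (estimate -> number of students).
older = {18: 1, 19: 5, 20: 8, 21: 5, 22: 1}
younger = {17: 1, 18: 3, 19: 4, 20: 4, 21: 4, 22: 3, 23: 1}

print(proportion_within(older, 20, 1))    # 9/10
print(proportion_within(younger, 20, 1))  # 3/5
print(proportion_within(younger, 20, 2))  # 9/10
```

With these made-up counts, an error of 1 or less among the older students is about as common as an error of 2 or less among the younger students.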


The preceding example illustrates why it is often useful
to consider the variability of a distribution when you are
evaluating a particular observed value. For
example, imagine you were teaching a course in
psychology. You gave your students two examinations
during the semester. Suppose there were 10 questions
on each test and the student received either a score of 1
or a score of 0 on each question. The possible total
score on each test would be somewhere between ______ and ______.
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65. 


66. 


67. 


68. 


Suppose the results of these tests were those shown in
the following frequency tables.

[Frequency tables for Examination A and Examination B, with columns Score and Frequency; the entries are illegible.]

According to these data, there were ______ students in
the class.


There must have been 20 students in the class because
the sum of the ______ in each frequency
table is 20.


The variability of the scores on Test ______ appears to
A/B
be slightly larger than the variability on the other test.
The range is one statistic which would represent this
difference in variability, since the range was ______ on
Test A and ______ on Test B.

B


It also appears that σ² was larger on Test ______ than on
A/B
the other test.
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69. 


70. 


71. 


"2. 


The preceding absolute frequency distributions could be
converted to relative frequency distributions by
dividing each frequency by ______ in order to convert    20
it to a ______.    proportion

The two proportional distributions which you would obtain 
in this manner are shown in the following tables. 


[Relative frequency tables for Examination A and Examination B: scores 0 through 10; the proportions are illegible.]


Notice that [illegible] of the students received scores of
9 or lower on Examination A, whereas ______/20 of the    3
students received scores of [illegible] or lower on Examination B.


The mean of both previous distributions (Examination A
and B) is 5 1/2. A score of 5 would have a negative
deviation of -1/2 from the mean and a score of 6 would
have a positive deviation of ______ from the mean.


The only two scores which have an absolute deviation of
1/2 or less (from the mean of 5 1/2) are ______
and ______.    5, 6


One-half of all 20 students received scores whose
absolute deviation from the mean was less than [illegible] on
Examination ______.
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74. 


75. 


76. 


The scores 5 and 6 are the only scores deviating from
the mean by 1/2 or less on both examinations. The
proportion of students who received scores of 5 or 6
is equal to [illegible]
on Examination ______.
B
The proportion of students who received either a score
of 5 or 6 on Examination B is equal to ______ + ______,
which equals ______.


It was slightly more unusual for a subject's score to
have an absolute deviation from the mean of more than
one-half on Examination ______ than on the other    B
examination, since a higher proportion of students
received grades of 5 or 6 on Examination ______.    B


The absolute deviation of an observed value from the
mean does not necessarily indicate how unusual that
value was in relation to the rest of the distribution. If
most of the values in the distribution were grouped very
closely around the mean, the proportion of values whose
absolute deviation from the mean was greater than 1
might be very small. On the other hand, if the
distribution were quite variable, a much higher
proportion of the values might have absolute deviations
from the mean greater than 1.
76. (Continued)


For example, consider the two distributions shown below. 
[Graphs: DISTRIBUTION A and DISTRIBUTION B, each plotted over the values 0 through 10]


The mean of both distributions is 5. However, the
variability of Distribution ____ appears to be greater
A/B
than the variability of the other distribution.

77. While the mean of both distributions is 5, absolute
deviations greater than 1 occur much more frequently
in Distribution ____ than they do in the other
A/B
distribution.




64.

The statistician might propose a random sampling
procedure of the following sort. Every student enrolled
at the university would be assigned a different number
from 1 to 10,000. Each of these numbers would be
written on a separate slip of paper and all of these slips
could be placed in a large basket. Then, the basket
would be thoroughly shaken so that all the slips of paper
inside were scrambled. If you wanted to obtain a sample
consisting of 10 students' opinions, for example, you
would simply draw out 10 slips of paper identifying 10
students. You could obtain these 10 students' opinions
by contacting them directly. Their answers would
represent a sample of 10 student opinions. This would
be a procedure for obtaining a ____ sample of   random
random/biased
10 opinions, since every student's opinion had the
____ opportunity to be included in the sample.   same

65.

You might then point out to the statistician that even a
procedure of this sort would not preclude the possibility
of obtaining an unrepresentative sample. For instance,
you might be unlucky and obtain a sample of 10 opinions
of which 9 were against the overseas campus and 1 was
in favor of the overseas campus.

If .7 of the whole student body were actually in favor of
the overseas campus, the population proportion would
be .7, whereas the corresponding sample statistic would
be ____. This would mean your error of estimate   .1
would be .1 minus .7, which equals ____.   -.6


78. 


79. 
While the value 7 represents a deviation from the mean
of ____ in either distribution, values with deviations
that large or larger were more unusual in Distribution
____ than they were in the other distribution.
A/B


Statisticians describe an observed value in a way that 
takes into account the variability of the distribution. You 
saw earlier how an observed value could be represented 
by its deviation from the mean rather than by its actual 
value. In addition, it is sometimes useful to indicate 
the relationship between that deviation and the standard 
deviation of the distribution. For example, if the mean 
of a particular distribution were 10, the value 15 would 
have a deviation from the mean of ____. If the standard
deviation of the distribution were 5, you could say the
value 15 deviates from the mean by exactly one standard
deviation. Similarly, since the value 20 has a deviation
from the mean of 10, 20 would be exactly ____
two/three
standard deviations away from the mean if σ = 5.
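
The arithmetic in this frame is just a deviation divided by the standard deviation. Here is a minimal sketch, using the frame's mean of 10 and standard deviation of 5; the function name is an assumption made for illustration.

```python
def deviations_in_sd_units(value, mean, sd):
    """Express a value's deviation from the mean in standard deviation units."""
    return (value - mean) / sd

print(deviations_in_sd_units(15, 10, 5))   # 1.0
print(deviations_in_sd_units(0, 10, 5))    # -2.0
```

A negative result simply means the value lies below the mean.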


The variance of a distribution is simply the typical
(mean) squared deviation from the mean of that distribution.
The standard deviation squared would equal the ____.
82.


83. 


84. 


Consider the following relative frequency distribution. 


VALUE PROPORTION 


It can be shown that the variance of this distribution is
16. Therefore, the ____ ____ is equal
to √16 or 4.
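
If you would like to see how a variance of 16 could arise, the sketch below works one out. The distribution in it is hypothetical (it is not the one printed in this frame), and the function name is an assumption.

```python
import math

def mean_and_variance(distribution):
    """distribution maps each value to its proportion (proportions sum to 1)."""
    mean = sum(v * p for v, p in distribution.items())
    variance = sum(p * (v - mean) ** 2 for v, p in distribution.items())
    return mean, variance

# Hypothetical relative frequency distribution: half the values are 0, half are 8.
dist = {0: 0.5, 8: 0.5}
mean, variance = mean_and_variance(dist)
print(mean, variance, math.sqrt(variance))   # 4.0 16.0 4.0
```

The variance is the mean squared deviation, and its square root is the standard deviation.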


The mean is also 4. Thus, the values 3 and 5 ____
are/are not
more than one standard deviation away from the mean
because the absolute deviation from the mean of 4 for
both the value 3 and the value 5 is ____.


You could say the value 5 is 1/4 of a standard deviation
away from the mean, since the value 5 represents a
deviation from the mean of 1 and 1 is 1/4 the size of the
standard deviation 4. Similarly, the value 6 would be
____ of a standard deviation from the mean, since
the value 6 deviates from the mean by 2 and 2 is
____ the size of the standard deviation 4.


We have described the distance or difference between a
particular value and its mean as the ____
of that value from that mean.
86.


(Continued) 


If a value is smaller than the mean, you say that it has a
____ deviation. If a value is larger than the
positive/negative
mean, that value has a ____ deviation.
positive/negative

If the distribution had a mean of 6, the value 7 would
represent a deviation of ____ and the value 5 would
represent a deviation of ____.


Suppose the distribution of your data were approximately 
"bell shaped." You would know that most of the observed 
values were clustered around the mean with progressively 
fewer and fewer observed values farther away from the 
mean. In other words, the frequency of values having a
deviation of 2 is probably greater than the frequency of
values with a deviation of ____.


The values farthest away from the mean in a "bell-shaped"
distribution are often referred to as the tails
of the distribution (since the distribution appears to
taper into a tail at the extreme values). For example,
the ____ areas in the following distribution
shaded/unshaded
would be referred to as the tails of the distribution.

[Graph: bell-shaped distribution; vertical axis FREQUENCY, horizontal axis VALUES]
87.


88. 


89. 


90. 


91. 


92. 


The values with the ____ absolute
largest/smallest
deviations from the mean appear in the tails of a
bell-shaped distribution.

In a so-called "bell-shaped" distribution, the farther a
value is from the mean (the more it deviates from the
mean), the ____ will be its frequency.
larger/smaller


You should recall that the standard deviation (σ) is a
deviation which, squared, would equal the ____.
If the variance of a distribution were 9, a deviation of 4
would be ____ than one standard deviation.
greater/less

If the mean of your distribution were 10 and the standard
deviation of the distribution were 4, a score of 14 would
have a deviation from the mean equal to ____ standard
1/2
deviation(s).


Another way of indicating that a particular value deviates
from the mean by one standard deviation is to say that
that value equals a standard score of one. Thus, if the
value 14 represents a standard score of one in a
distribution with a mean of 10, the standard deviation of
the distribution must be ____.

If the distribution had a standard deviation of 2 and a
mean of 10, then the value ____ would deviate from
the mean by one standard deviation. Therefore, 12
would represent a s____ s____ of one.
largest


smaller 


variance 


greater 


12 


standard score 
94.


95. 


96. 


If a particular value were said to equal a standard score 
of 2, that value would have a deviation from the mean 
equal to twice the standard deviation of the distribution. 
If a particular value equaled a standard score of -2, that 
value would represent a negative deviation from the mean 


equal in size to twice the ____ ____.


Below is a list of values forming a distribution whose
mean is 10 and whose variance is 25. Since the variance
is 25, the size of a standard deviation is ____.

VALUE     DEVIATION FROM 10     STANDARD SCORE
15        ____                  1
5         ____                  -1
10        ____                  0


Notice that the first value, 15, deviates from the mean
of 10 by ____. Since the standard deviation of the
distribution is 5, a deviation of 5 would be the same size
as one standard deviation. This is all we mean when we
indicate (as we did in the previous table) that the value
15 is equal to a standard score of ____.

The value 5 in the previous distribution represents a
negative deviation from the mean of -5 and, therefore,
is equivalent to a standard score of ____.
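
The three rows of the table can be worked out in a few lines. This is only a sketch under the frame's figures (mean 10, variance 25); the function name is an assumption.

```python
import math

def standard_score(value, mean, variance):
    """Deviation from the mean divided by the standard deviation."""
    return (value - mean) / math.sqrt(variance)

for value in (15, 5, 10):
    # value, deviation from 10, standard score
    print(value, value - 10, standard_score(value, 10, 25))
# 15 5 1.0
# 5 -5 -1.0
# 10 0 0.0
```

Notice that the sign of the standard score always matches the sign of the deviation.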


standard deviation
Most of the values in a normal distribution are clustered
around the mean, with fewer and fewer values farther
away from the mean (i.e., the distribution is "bell-shaped").
Thus, values representing a Z-score larger
than 2 would be ____ frequent than values
more/less
representing Z-scores less than 2.


You have already seen that .95 of all the values in a 
normal distribution are within two standard deviations 
of the mean. Therefore, .95 of all the values in a 
normal distribution would represent Z-scores between
-2 and ____.


Thus, a Z-score is a useful way of representing values 
in a normal distribution since it indicates how unusual 
such values are, regardless of the mean or variance of 
the distribution. No matter what the mean or variance 
of the normal distribution, you would know that a
Z-score as large as 1 was ____ frequent than a
more/less
Z-score as large as 3.


Suppose you were told that your score on the last
Psychology examination was approximately equal to a
Z-score of two. This would imply that the distribution
of test scores was approximately normal and that your
score was about ____ standard deviations above
the mean of the distribution.


Furthermore, since only .05 of the values in a normal
distribution are farther than two standard deviations
from the mean, and since these extreme values are
divided equally between negative and positive deviations,
you would know that only about ____ of the students
.05/.025
had made a ____ grade on the test.
better/poorer
To summarize then, a standard score (or Z-score if the
distribution is normal) is a convenient way of representing
a value, since it indicates how many standard
____ that value is away from the mean.   deviations

This is particularly useful to you in the case of a normal
distribution, since you know exactly what proportion
of the values in the distribution are within any
particular number of standard deviations from the mean.
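
These proportions can be computed rather than memorized. The sketch below relies on a standard result that this program does not derive: for a normal distribution, the proportion of values within k standard deviations of the mean equals erf(k/√2). The function name is an assumption.

```python
import math

def proportion_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(proportion_within(k), 2))
# 1 0.68
# 2 0.95
# 3 1.0

# Proportion in the upper tail beyond a Z-score of 2 (half of what lies outside).
upper_tail = (1 - proportion_within(2)) / 2
print(round(upper_tail, 3))   # 0.023
```

The rounded figures .95 and .025 used in the frames are convenient approximations of .9545 and .0228.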
Section VI: Samples and Populations


There are some things you can say about a collection of
data even before you actually collect it. Suppose you
were interested, for example, in the heights of students
at a particular high school. You could determine the
distribution of these heights by measuring and recording
the height of each student in the high school. If there
were 1,500 students, your collection of data would
consist of ____ observations of a variable called   1,500
"height."


Suppose you only knew the heights of ten of the 1,500
students. Although these ten observations form a collection
of data, they could also be considered as part of the
larger, complete collection of data. Statisticians use
the name sample to describe a collection of data which is
viewed as part of a larger, complete collection of data.
In other words, the collection of ____ heights   10
10/1,500
would be considered a sample in this illustration.


Statisticians refer to the complete collection of data
(of which the sample is a part) as a population. Thus,
the heights of the ten students would be considered a
sample, whereas the heights of the 1,500 students would
be considered a ____.   population
Let's look at another example of the difference between
a sample and a population. Suppose you wanted to know
which of two candidates for some public office was most
preferred by each of the 10,000 people in your city. If
you asked the first 100 people you met on the street to
state their preference, you would have a sample consisting
of ____ observations from the complete collection of   100
data (population) consisting of ____ observations.   10,000


Suppose you were interested in the yearly income of the
200 school teachers in your city. A collection of data
consisting of yearly incomes of only 5 of these teachers
would be a ____ if you viewed it as part   sample
sample/population
of the ____ consisting of the yearly incomes   population
sample/population
of all 200 teachers.


It is important to realize that a particular collection of
data could be treated as either a sample or population
depending upon how you view it. In the previous
illustration, for example, the complete collection of data
consisting of the yearly income of each of the 200
teachers in your city was viewed as a population. Suppose,
however, you were interested in the yearly incomes
of all the teachers in your state. A complete collection
of data would now consist of a list of yearly incomes of
all the teachers in the state.

This list would now be the population, whereas the list
of yearly incomes for the 200 teachers in your city now
could be viewed as a ____ from this   sample
sample/population
population.
10.


Notice also that the yearly income for any 5 particular
teachers in your city ____ be viewed as
could/could not
a sample from either a population consisting of the
incomes of all the teachers in your city, or the larger
population consisting of the yearly incomes of all the
teachers in your state.


A particular collection of data may be viewed as a
____ if you are comparing that collection
sample/population
to a larger collection of data of which it is a part, or as
a ____ if you are comparing it to a
sample/population
smaller collection of data which would be included in it.


To define a population, you could describe what would be
included in the complete collection of data. For example,
you could describe a collection of data consisting of the
ages of all the Democratic presidents. You could
describe another collection of data consisting of the ages
of all the Republican presidents.

Either one of these collections of data could be
considered a population, but they ____ be the
would/would not
same population.


A particular collection of data may be a sample from
more than one population. For example, a list of the
ages of all the truck drivers in Detroit could be a
s____ from the population consisting of the ages
of all the truck drivers in the United States. The same
collection of data could be considered a ____
from a population consisting of all the male workers in
Detroit.
13.


The ages of all the female school teachers in Detroit
____ be a sample from a population   would not
would/would not
consisting of the ages of all the male drivers in the
United States. However, the ages of all the female
school teachers in Detroit ____ be a sample   would
would/would not
from a population consisting of the ages of all the people
employed in Detroit.


Up to this point, we have only considered populations as
collections of data that could actually be completely
collected. For example, we discussed a population
consisting of the heights of all the students in a
particular high school. In principle, at least, it
____ be possible to actually collect a   would
would/would not
complete list of the heights of all the students in the high
school.


It is sometimes useful to think of a particular collection
of data as if it were a sample from a larger collection
of data consisting of an unlimited number of observations.
An example of a population consisting of an unlimited
number of observations would be a list of the outcomes
of an unlimited number of throws of a die (one member
of a pair of dice). You could roll a die and record the
number of dots showing on the face of the die as the
"outcome" of this toss. You could then roll it again,
and again, and again, each time recording the number
of dots showing on the face of the die. In this way, you
could go on producing a list of observed values without
ever specifying where you should stop. A list of outcomes
for any specific group of tosses could be viewed
as a ____ from a longer, unlimited list of   sample
outcomes, since this smaller collection of data would
be part of the larger, unlimited collection.
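
A computer can stand in for the die. The sketch below assumes a fair six-sided die and uses Python's random module with a fixed seed so that the run can be repeated; the function name and the choice of 10 tosses are illustrative only.

```python
import random

def toss_die(n_tosses, rng):
    """Return the outcomes of n_tosses throws of a fair six-sided die."""
    return [rng.randint(1, 6) for _ in range(n_tosses)]

rng = random.Random(0)        # fixed seed so the run is repeatable
sample = toss_die(10, rng)    # one specific group of tosses
print(sample)
```

Nothing in the procedure says where to stop, so any list it produces is one sample from the unlimited list of outcomes.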
14.


15. 


16. 


If a population consists of a specific number of
observations, it is called a finite population. The hair
color of every person in the United States is a finite
population since, in principle at least, you could actually
list the hair color of every person in the United States.

Similarly, since there are a limited number of people in
the world at any one time, a list of the heights of everyone
in the world would consist of a ____ number   limited
limited/unlimited
of observations and would therefore be a finite population.


A population viewed as consisting of an unlimited number
of observations is referred to as an infinite population.
A population consisting of the yearly income of each
person in a particular city ____ be an infinite   would not
would/would not
population, since the number of people in the city is
limited (in principle, you could actually count them all).


Suppose you shuffled a deck of playing cards, dealt out
five cards, and counted the number of red cards among
the five. You could consider this number to be a single
observation of a variable whose 6 possible values
are: 0, 1, 2, 3, 4, and 5. You could shuffle the cards
again, deal out five more cards, and count the number of
red cards in this new group of five. You could continue
this process indefinitely. (The procedure for generating
this list of observations does not specify any end to the
list.) In other words, the possible number of
observations would be ____. Any specific   unlimited
limited/unlimited
16.


17. 


18. 


19. 


(Continued) 


group of observations, therefore, could be considered to
be a sample from a(n) ____ population consisting
finite/infinite
of the unlimited number of observations.


We have seen that every collection of data describes a 
distribution. The distribution can either be described in 
absolute terms, by listing the frequency with which each 
value occurred in the data, or it can be described in 
relative terms, by listing the p____ of times
each value occurred in the data. 


A distribution can be described by statistics such as the
mean, the median, and the mode, all of which
characterize its c____ t____,
or a distribution can be described by statistics such as
the range and the variance, which describe its
v____.


While you may be interested in the distribution of a
particular collection of data, for one reason or another
you will often have only part of that complete collection.
In other words, although you are really interested in a
____, you may have only a ____.
population/sample   population/sample
central tendency


variability 


population 
Suppose, for example, you were interested in the
number of people in the United States with a college
education. It would probably be too difficult, too
time-consuming, and too expensive to actually list whether or
not each person in the United States held a college
degree. Suppose, however, you stopped twenty people
on a street corner and asked them if they held a college
degree. The data you collected would be only part of the
data in which you were interested. Therefore, the
twenty observations would be a ____ from the   sample
____ consisting of the education of all   population
the people in the United States.

21.

22.


Since there is a specific number of people in the United
States, your list of twenty observations would be a
sample from a ____ population.
finite/infinite


Suppose you found that four of the twenty people you had
stopped on the street did hold college degrees. You
could describe the distribution of your sample by the
absolute frequency distribution shown in Graph ____
A/B
or by the relative frequency distribution shown in
Graph ____.
A/B

[Graphs: GRAPH A and GRAPH B, bar graphs of the "Degree" and "No Degree" categories; one vertical axis is labeled PROPORTION and the other FREQUENCY (0 to 20)]


finite 


107.

128.

(Continued)

This table indicates how many more observations are
included as you proceed in 1/2 standard deviation steps
away from the mean. Thus, the first row indicates
that ____ of the values were within 1/2 a standard
deviation of the mean.

The second row indicates ____ of the 8
observed values were within 1 standard deviation
of the mean.

No additional values were included when the limit was
extended to include values less than 1 1/2 standard
deviations from the mean. However, all of the values
are within ____ standard deviations of the mean.   2

Statisticians often describe the shape of a distribution
by indicating the proportion of values within different
numbers of standard deviations of the mean. For
example, a very important type of distribution, the normal
distribution, can be described in this manner.

Number of Standard
Deviations from the Mean     Proportion of Observations
Less than one
Less than two
Less than three

Notice that .68 of all the values in this type of distribution
are within ____ standard deviation of the mean.   1
(The proportions in the above table have been rounded
off to the nearest hundredth.)


23. 


24. 


26. 
If you felt that the sample of 20 people was typical or
representative of the whole population, you might guess
that the proportion of people in the United States with
college degrees was about ____.

In other words, while you don't actually have a complete
collection of data, you might feel that the sample is
similar to the ____.

A guess at some statistic describing the population on
the basis of a sample from that population is called a
statistical inference. You made a ____
____ about the proportion of college
graduates in the United States on the basis of a sample
of 20 people.

A statistic describing a population is often called a
population parameter. For example, if you had a
complete collection of data listing whether or not each
person in the United States had a college degree, you
could calculate the true proportion of people with
college degrees in this population. Since the proportion
would be a statistic describing a population, it would be
called a population p____.

Any statistic describing a sample is called a sample
statistic. Thus, the proportion of the people with
degrees in your sample would be a ____.
sample statistic/population parameter
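
The sample statistic from these frames takes one line to compute. The sketch below encodes the street-corner sample (4 of 20 people with degrees) as a list of labels; the labels and the function name are assumptions made for illustration.

```python
def sample_proportion(observations, category):
    """Proportion of the observations that fall in the given category."""
    return observations.count(category) / len(observations)

# The sample described above: 4 of the 20 people held a college degree.
sample = ["degree"] * 4 + ["no degree"] * 16
estimate = sample_proportion(sample, "degree")
print(estimate)   # 0.2
```

Using this figure as a guess about the whole United States is the statistical inference; the calculation by itself describes only the sample.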
30.


Suppose you were a shoe manufacturer who had to decide
what proportion of his total shoe production should be
devoted to each possible shoe size. You would want the
distribution of shoe sizes you produced to be as close as
possible to the actual distribution of shoe sizes required
by the people who could purchase your shoes, since it
would not be efficient for you to produce more shoes
than you could sell in one size and fewer shoes than
could be sold in another size. In principle at least, you
could measure the shoe size of every potential customer
and thereby determine the actual distribution of shoe
sizes in the population. It probably would be out of the
question, however, to actually determine the shoe sizes
of all these people. Suppose you were able to obtain the
shoe sizes of 100 potential customers. You might
guess something about the distribution of the ____
sample/population
based on the distribution of the ____
sample/population
you had obtained.

In other words, on the basis of a sample of 100 shoe
sizes from the population, you would be making a
statistical in____ concerning the
distribution of shoe sizes in the population of potential
customers.

The difference between the largest and the smallest shoe
size in the sample of 100 shoe sizes would be the
____ of the sample.
35.


The difference between the largest and the smallest
shoe size in the population of all potential customers
is a statistic that describes the ____ and
population/sample
would, therefore, be called a p____ statistic.

The variance of your sample would be a ____.
sample statistic/population parameter

If you had a complete collection of data concerning the
shoe sizes of all potential customers, you could actually
calculate the true variance of the population. This
variance would be a population statistic or p____.

Suppose you guessed that the population (parametric)
mean was the same as the mean of your sample. You
would be using a ____ statistic as an
population/sample
estimate of a ____ statistic.
population/sample

You would be making an inference about the ____
on the basis of a ____. The difference between
your estimate of the population mean and the true
population mean would be your error of estimate.
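
A small made-up example may help here. The sketch below assumes the error is taken as the estimate minus the true value, so an estimate that is too low gives a negative error; that sign convention, the function name, and both lists of numbers are assumptions, not figures from this program.

```python
def error_of_estimate(estimate, true_value):
    """Signed error, taken here as estimate minus true value (an assumed convention)."""
    return estimate - true_value

population = [8, 9, 10, 10, 11, 12]   # hypothetical complete collection of data
sample = [8, 9, 10]                   # hypothetical sample from that collection

population_mean = sum(population) / len(population)   # 10.0
sample_mean = sum(sample) / len(sample)               # 9.0
print(error_of_estimate(sample_mean, population_mean))   # -1.0
```

In practice you would rarely know the population mean, so the error of estimate is usually something you reason about rather than compute.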


In other words, if the true mean were 10 and your estimate 


were 9,the error of estimate would be 10 - EE 
Consider the two collections of data shown below:

[Tables: TABLE A and TABLE B, each listing Observation and Value]


Suppose the data in Table A represented a population and
the data in Table B represented a sample from that
population. The mode of the population is ____,
whereas the mode of the sample is ____.


If you used the sample mode as an estimate of the 
population mode, your error of estimate would equal 


20 - WOR 


If the sample were typical, or representative, of the
population, the difference between a sample statistic
and the population statistic would be small. In other
words, your e____ of e____ would
be small if you used this sample statistic as your
estimate of the population statistic.

However, the sample statistic might be quite different
from the population statistic and the error of estimate
would then be ____.
large/small
40. Some samples you obtained could be quite representative
of the population because the sample statistic was very
similar to the population statistic. However, you could
also obtain samples which gave you a poor or distorted
picture of the population. It would be useful to know how
often different values of the sample statistic would occur
if you collected one sample after another. Suppose there
were 1000 students at the high school and you knew the
height of five particular students. The collection of
1000 heights would be your ____,   population
sample/population
whereas the collection of five heights would be a ____.   sample
sample/population


41. Suppose you had twenty such samples from this
population, where each sample was a collection of the
heights of five students. If you calculated the mean of
each of these samples, the list of these twenty means
would be a collection of twenty ____   sample
sample/population
statistics.


42. The frequencies with which each possible value of the 
sample mean occurred in this collection of twenty 
sample statistics define a distribution of sample means. 
The distribution of these twenty sample means is called 
a sampling distribution, since it is a distribution of 


sample statistics. 


Instead of calculating the mean of each of the twenty
samples, you could calculate the variance of each
sample. The list of twenty sample variances would
also be a collection of twenty ____ statistics.   sample

Therefore, the distribution of these twenty sample
variances would be a ____ distribution.   sampling
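
You can build a sampling distribution of this kind by simulation. The sketch below invents a population of 1000 heights (whole inches from 60 to 74, chosen only for illustration), draws twenty samples of five heights each, and tallies the sample means rounded to the nearest inch; every name and figure in it is an assumption.

```python
import random
from collections import Counter

rng = random.Random(42)   # fixed seed so the run is repeatable
# Hypothetical population: heights in inches of 1000 students.
population = [rng.randint(60, 74) for _ in range(1000)]

def sample_means(population, n_samples, sample_size, rng):
    """Mean of each of n_samples random samples drawn without replacement."""
    return [sum(rng.sample(population, sample_size)) / sample_size
            for _ in range(n_samples)]

means = sample_means(population, 20, 5, rng)
# Frequency of each sample mean, rounded to the nearest inch.
sampling_distribution = Counter(round(m) for m in means)
for value in sorted(sampling_distribution):
    print(value, sampling_distribution[value])
```

Replacing the mean with the variance inside sample_means would give a sampling distribution of sample variances instead.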
43.


44. 


45. 


46. 


Similarly, you could calculate the median, the mode, the
range, or the standard error (or any other sample
statistic) of these twenty samples. In each case, the
distribution of the resulting twenty sample statistics
would be called a s____ distribution.

If you collected several samples from some population
and calculated the mean of each sample, the following
graph might be the frequency distribution of these
sample means and would therefore be a ____
distribution of sample means.
According to this graph, there were ____ samples,
10/20
since you calculated a mean for each sample.

The largest (not the most frequent) sample mean was
____ and the smallest sample mean was ____.

The most frequently occurring sample mean was 8
since ____ of the twenty samples had that mean.

Therefore, the modal value of the sample mean was ____.
47.


48. 


Suppose you were interested in a population consisting
of the yearly incomes of all of the people in a particular
city. If you determined the yearly income of ten people
you stopped on the street in that city, the resulting list
of ten incomes would be considered a ____ from   sample
that population.

Imagine you collected twenty samples, where each
sample consisted of a list of the yearly incomes (to the
nearest thousand dollars) of ten people. Furthermore,
imagine you collected these twenty samples by stopping
people in the lobby of the most expensive hotel in town.
Suppose you collected another 20 samples on a street
corner in the poorest section of town. The graphs
shown below could indicate the distribution of sample
means in each group of twenty samples.
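
To see why the place where you collect a sample matters, you can imitate the two collecting sites. In the sketch below the income figures (in thousands of dollars) are invented and were chosen so that the two groups do not overlap; the names are assumptions as well.

```python
import random

rng = random.Random(7)   # fixed seed so the run is repeatable
# Hypothetical yearly incomes in thousands of dollars; all figures are made up.
hotel_lobby = [rng.randint(20, 60) for _ in range(500)]
poor_section = [rng.randint(2, 8) for _ in range(500)]

def means_of_samples(people, n_samples=20, sample_size=10):
    """Mean income of each of n_samples samples of sample_size people."""
    return [sum(rng.sample(people, sample_size)) / sample_size
            for _ in range(n_samples)]

lobby_means = means_of_samples(hotel_lobby)
street_means = means_of_samples(poor_section)
print(min(lobby_means), max(lobby_means))
print(min(street_means), max(street_means))
```

With these invented figures every sample mean from the hotel lobby is larger than every sample mean from the street corner, so neither group of samples would represent the whole city well.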
RESPONSE MASK

Tear this mask off and use it to cover the responses on the
right-hand side of each page as you work through this
programmed text.

Study the first frame (statement). Then write your response in
the space provided or on a separate sheet of paper. Now move the
mask down to reveal the printed (correct) response and check your
answer against it. Remember to keep the other responses covered.

Repeat this procedure for each frame. When you have completed
a page, slip this mask over the responses on the next page and
continue your studies.